Polypeptide fragments capable of competition with Streptococcus mutans antigen I/II

ABSTRACT

Defined peptide subunits of Streptococcus mutans antigen I/II are useful as agents to prevent and treat dental caries either by eliciting an immunological response or by preventing adhesion of S. mutans to the tooth.

This invention relates to polypeptide fragments of the Streptococcus mutans I/II antigen that are useful in treating and preventing dental caries.

Streptococcus mutans is the main etiological agent of dental caries, a disease which affects mammals including humans.

The S. mutans I/II antigen (SA I/II) is a cell surface protein with an M_(r) of about 185 kDa. It is believed to comprise several antigenic epitopes and to be at least partly responsible for S. mutans adhesion to teeth.

SA I/II is described in British Patent No. 2,060,647, as are a number of antibodies to it. A putative 3.5 to 4.5 kDa fragment of SA I/II, "antigen X", has also been described in European Patent No. 0 116 472.

However, it has now become clear that "antigen X" is not a fragment of SA I/II at all. Rather, it is a separate protein that merely co-purifies with SA I/II. It is believed to be encoded by a separate gene.

Two large fragments of SA I/II, an N-terminal fragment (residues 39 to 481) and a 40 kDa central fragment (residues 816 to 1213) are recognised by human serum antibodies. Within the central fragment, 80% of the sera tested recognise elements within a proline-rich region (residues 839-955) that comprises three tandem repeats. This suggests that this region includes one or more B-cell epitopes. The central fragment (residues 816-1213) is also believed to comprise one or more adhesion sites that mediate S. mutans' attachment to the tooth.

The aim of the above-mentioned work has been the development of vaccines for immunisation against dental caries. However, precise identification of the antigenic epitopes within SA I/II is a prerequisite for designing synthetic vaccines based on it. Similarly, precise identification of adhesion sites is essential for the design of drugs against dental caries that rely on inhibiting S. mutans' adhesion to the tooth.

No antigenic epitopes (T-cell or B-cell epitopes) or adhesion sites within SA I/II have been characterised, nor has the precise location of any such regions been suggested. Also, there has been no indication of the location of S. mutans' T-cell epitopes as the above-mentioned work has concentrated on S. mutans' ability to adhere to teeth and to generate a B-cell response.

The inventors have identified a number of T-cell epitopes, B-cell epitopes and adhesion sites within residues 803 to 1114 of SA I/II. Some of the T-cell and B-cell epitopes overlap or are contiguous with each other and/or with one or more of the adhesion sites.

The presence of a number of antigenic epitopes of both types and a number of adhesion sites within the same region of SA I/II could not have been predicted and the finding that some of the adhesion sites and epitopes overlap or are contiguous with each other is particularly surprising.

These findings make it possible to design effective synthetic vaccines against dental caries as well as drugs that engender resistance against the disease or alleviate pre-existing cases of it by preventing S. mutans' adhesion to the tooth. Further, the surprising finding that some of the antigenic epitopes and adhesion sites are contiguous or overlapping makes it possible to design bifunctional drugs that effect immunisation against dental caries as well as preventing adhesion of S. mutans to the tooth.

Accordingly, the present invention provides a nucleic acid sequence which codes upon expression in a prokaryotic or eukaryotic host cell for a polypeptide product having one or more properties selected from (i) the ability to adhere to a mammalian tooth in a competitive manner with naturally occurring Streptococcus mutans antigen I/II, thus preventing or diminishing the adhesion of S. mutans to the tooth; (ii) the ability to stimulate a T-cell response; and (iii) the ability to stimulate a B-cell response, said nucleic acid sequence being selected from:

(a) the sequences shown in SEQ. ID. Nos. 12 to 22 or the complementary strands thereof;

(b) nucleic acid sequences having a length of not more than 1000 base pairs which hybridise to the sequences defined in (a) over at least 70% of their length;

(c) nucleic acid sequences having a length of not more than 1000 base pairs which, but for the degeneracy of the genetic code, would hybridise to the nucleic acid sequences defined in (a) or (b) over at least 70% of their length and which sequences code for polypeptides having the same amino acid sequence.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Depiction of the panel of overlapping 20 mers used to map T-cell, B-cell and adhesion epitopes within SA I/II.

FIGS. 2A and 2B. Proliferative responses to overlapping synthetic peptides (20-mers) of SA I/II.

FIG. 2A Mean S.I. (±sem) of PBMC from 30 subjects. Mean cpm with medium only was 538±112.

FIG. 2B Frequency of positive responses (S.I.≧3.0, cpm>500).

FIG. 3. MHC class II dependency of proliferative responses to SA I/II.

FIGS. 4A-4D. Serum recognition of SA I/II and recombinant polypeptide fragments. Western blots from 3 subjects are shown in FIGS. 4A-4C together with rabbit anti-SA I/II antiserum as shown in FIG. 4D. Lanes, 1, SA I/II; lane 5, recombinant 984-1161.

FIG. 5. Human serum recognition of synthetic peptides of SA I/II. Titres were determined by ELISA in 22 subjects to selected peptides of SA I/II and an irrelevant control peptide from SIVp27(SIV). The frequencies of sera binding the peptides with a titre>mean±2S.D. of the control peptide are also indicated.

FIGS. 6A and 6B. Inhibition of adhesion of S. mutans.

FIG. 6A SA I/II and recombinant fragment 984-1161.

FIG. 6B Synthetic peptides.

FIG. 7. Proliferative responses of murine splenocytes following immunization with recombinant 975-1044 (SEQ. ID. No. 8).

FIG. 8. Competitive inhibition of SA I/II binding by various polypeptides.

FIG. 9. Dependence of competitive inhibition of SA I/II binding on concentration of two peptides.

FIG. 10. Effects of substitution of certain residues on competitive inhibition.

FIG. 11. Comparison of various recombinant polypeptides with respect to binding.

The polypeptides of the invention have one or more of the following properties. Firstly, they may have the ability to adhere to a mammalian tooth in a competitive manner with naturally occurring Streptococcus mutans antigen I/II, thus preventing or diminishing the adhesion of S. mutans to the tooth. Some of the peptides of the invention have been shown to inhibit adhesion of S. mutans to a tooth surface model (whole human saliva adsorbed to the wells of polystyrene microtitre plates or hydroxyapatite beads). Thus, these peptides comprise one or more adhesion sites and will adhere to a mammalian tooth in a competitive manner with naturally occurring SA I/II. Therefore, peptides according to the invention that comprise the adhesion site prevent or diminish the adhesion of S. mutans to the tooth. Peptides of the invention that comprise one or more adhesion epitopes include SEQ. ID. Nos. 1 to 6 and 8 to 10.

Secondly, peptides according to the invention may have the ability to stimulate a T-cell response. The inventors have shown that residues 803 to 854 and 925 to 1114 of SA I/II comprise a number of T-cell epitopes that are at least partially responsible for the T-cell response stimulated by the intact protein. Therefore, peptides according to the invention that comprise one or more of these T-cell epitopes stimulate a T-cell response against S. mutans infection. Peptides of the invention that stimulate a T-cell response include those shown in SEQ. ID. Nos. 1 to 11.

Thirdly, the peptides of the invention may stimulate a B-cell response. The inventors have shown that residues 803 to 854 and 925 to 1114 of SA I/II comprise a number of B-cell epitopes and polypeptides according to the invention that comprise one or more B-cell epitopes stimulate a B-cell response against S. mutans infection. Peptides of the invention that comprise one or more B-cell epitopes include those shown in SEQ. ID. Nos. 1, 3 to 7 and 10.

The nucleic acid sequences of the present invention are preferably DNA, though they may be RNA. It will be obvious to those of skill in the art that, in RNA sequences according to the invention, the T residues shown in SEQ. ID. Nos. 12 to 22 will be replaced by U. Nucleic acid sequences of the invention will typically be in isolated or substantially isolated form. For example, up to 80, up to 90, up to 95 or up to 100% of the nucleic acid material in a preparation of a nucleic acid of the invention will typically be nucleic acid according to the invention.
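The T-to-U replacement described above can be illustrated with a minimal Python sketch. The function name and the 12-base example fragment are hypothetical and are not taken from the sequence listing.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence by replacing every T with U."""
    return dna.upper().replace("T", "U")


# Hypothetical 12-base fragment used only for illustration;
# it is not one of SEQ. ID. Nos. 12 to 22.
example_dna = "ATGGCTTTAGAT"
example_rna = dna_to_rna(example_dna)
print(example_rna)
```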

Some preferred nucleic acid sequences of the invention are those shown in SEQ. ID. Nos. 12 to 22. However, the nucleic acid sequences of the present invention are not limited to these sequences. Rather, the sequences of the invention include sequences that are closely related to these sequences and that encode a polypeptide having at least one of the biological properties of naturally occurring SA I/II. These sequences may be prepared by altering those of SEQ. ID. Nos. 12 to 22 by any conventional method, or isolated from any organism or made synthetically. Such alterations, isolations or syntheses may be performed by any conventional method, for example by the methods of Sambrook et al (Molecular Cloning: A Laboratory Manual; 1989).

For example, the sequences of the invention include sequences that are capable of selective hybridisation to those of SEQ. ID. Nos. 12 to 22 or the complementary strands thereof and that encode a polypeptide having one or more of the properties defined above. Such sequences capable of selectively hybridizing to the DNA of SEQ. ID. Nos. 12 to 22 will generally be at least 70%, preferably at least 80 or 90% and more preferably at least 95% homologous to the DNA of SEQ. ID. Nos. 12 to 22 over a region of at least 10, preferably at least 20, 30, 40, 50 or more contiguous nucleotides.

Such sequences that hybridise to those shown in SEQ. ID. Nos. 12 to 22 will typically be of similar size to them, though they may be longer or shorter. However, if they are longer, they may not simply encode large fragments of native SA I/II amino acid sequence. Thus, sequences that hybridise to those of SEQ. ID. Nos. 12 to 22 may be sequences of up to 1000 bases in length, for example up to 950 or up to 933 bases in length, 933 bases being the length of the DNA sequence encoding the largest specifically identified peptide of the invention (SEQ. ID. No. 21). Also, sequences that hybridise to those of SEQ. ID. Nos. 12 to 22 must do so over at least 50% of their length, for example up to 60%, up to 70%, up to 80%, up to 90%, up to 95%, or up to 99% of their length.
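The two numerical limits in the preceding paragraph (a length of up to 1000 bases, and hybridisation over at least 50% of the candidate's length) can be expressed as a simple check. This is only a sketch: the function name is hypothetical, and the number of hybridising bases is assumed to be supplied from an experiment or an alignment, which the code does not perform.

```python
MAX_LENGTH_BASES = 1000       # upper length limit stated in the text
MIN_HYBRID_FRACTION = 0.50    # "at least 50% of their length"


def meets_length_and_coverage(candidate_length: int, hybridising_bases: int) -> bool:
    """Check a candidate sequence against the length and coverage limits.

    candidate_length  -- total length of the candidate sequence in bases
    hybridising_bases -- number of those bases that hybridise to one of the
                         reference sequences (assumed to be measured elsewhere)
    """
    if candidate_length <= 0 or candidate_length > MAX_LENGTH_BASES:
        return False
    return hybridising_bases / candidate_length >= MIN_HYBRID_FRACTION


# A 933-base candidate hybridising over 700 bases satisfies both limits.
print(meets_length_and_coverage(933, 700))
```

A candidate of 1200 bases fails the check regardless of coverage, which reflects the requirement that longer sequences may not simply encode large fragments of native SA I/II.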

Such hybridisation may be carried out under any suitable conditions known in the art (see Sambrook et al (1989): Molecular Cloning: A Laboratory Manual). For example, if high stringency is required, suitable conditions include 0.2×SSC at 60° C. If lower stringency is required, suitable conditions include 2×SSC at 60° C.

Also included within the scope of the invention are sequences that differ from those defined above because of the degeneracy of the genetic code and encode the same polypeptide having one or more of the properties defined above, namely the polypeptide of SEQ. ID. Nos. 1 to 11 or a polypeptide related to one of these polypeptides in any of the ways defined below.

Thus, the nucleic acid sequences of the invention include sequences which, but for the degeneracy of the genetic code, would hybridise to those shown in SEQ. ID. Nos. 12 to 22 or the complementary strands thereof. However, such sequences may not simply encode large fragments of native SA I/II amino acid sequences. Thus, these sequences may be up to 1000 bases in length, for example up to 950 or 933 bases in length. Also, their sequence must be such that, but for the degeneracy of the genetic code, they would hybridise to a sequence as shown in SEQ. ID. Nos. 12 to 22 over at least 50% of their length, for example, up to 60%, up to 70%, up to 80%, up to 90%, up to 95% or up to 99% of their length.

Also, the nucleic acid sequences of the invention include the complementary strands of the sequences defined above, for example the complementary strands of the nucleic acid sequences shown in SEQ. ID. Nos. 12 to 22.
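For illustration, the complementary strand of a DNA sequence can be derived as follows. This is a minimal sketch; the function name and the six-base example are hypothetical, and the result is written 5' to 3' (the reverse complement), which is one common convention.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of a DNA sequence, read 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


# Hypothetical fragment, not taken from the sequence listing.
print(reverse_complement("ATGGCT"))
```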

Nucleic acid sequences of the invention will preferably be at least 30 bases in length, for example up to 50, up to 100, up to 200, up to 300, up to 400, up to 500, up to 600, up to 800 or up to 1000 bases.

Nucleic acid sequences of the invention may be extended at either or both of the 5' and 3' ends. Such extensions may be of any length. For example, an extension may comprise up to 10, up to 20, up to 50, up to 100, up to 200 or up to 500 or more nucleotides. A 5' extension may have any sequence apart from that which is immediately 5' to the sequence of the invention (or the native sequence from which it is derived) in native SA I/II. A 3' extension may have any sequence apart from that which is 3' to the sequence of the invention (or the native sequence from which it is derived) in native SA I/II. Thus, the nucleic acid sequences of the invention may be extended at either or both of the 5' and 3' ends by any non-wild-type sequence.

The polypeptides of the invention are encoded by the DNA sequences described above. Thus, the polypeptides of the invention are not limited to the polypeptides of SEQ. ID. Nos. 1 to 11 although these sequences represent preferred polypeptides. Rather, the polypeptides of the invention also include polypeptides with sequences closely related to those of SEQ. ID. Nos. 1 to 11 that have one or more of the biological properties of SA I/II. These sequences may be prepared by altering those of SEQ. ID. Nos. 1 to 11 by any conventional method, or isolated from any organism or made synthetically. Such alterations, isolations or syntheses may be performed by any conventional method, for example by the methods of Sambrook et al (Molecular Cloning: A Laboratory Manual; 1989). In particular, polypeptides related to those of SEQ. ID. Nos. 1 to 11 may be prepared by modifying DNA sequences as shown in SEQ. ID. Nos. 12 to 22 and expressing them recombinantly.

The polypeptides of the invention may be encoded by nucleic acid sequences that have less than 100% sequence identity with those of SEQ. ID. Nos. 12 to 22. Thus, polypeptides of the invention may include substitutions, deletions, or insertions, that distinguish them from SEQ. ID. Nos. 1 to 11 as long as these do not destroy the biological property or properties that the polypeptides have in common with SA I/II.

A substitution, deletion or insertion may suitably involve one or more amino acids, typically from one to five, one to ten or one to twenty amino acids, for example, a substitution, deletion or insertion of one, two, three, four, five, eight, ten, fifteen, or twenty amino acids. Typically, a polypeptide of the invention has at least 60%, at least 80%, at least 90%, or at least 95% sequence identity to the sequence of any one of SEQ. ID. Nos. 1 to 11.
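One way to compute a sequence identity figure of the kind mentioned above is sketched below. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it defines identity as matching positions divided by alignment length. Other definitions exist, and the text does not specify which is intended; the function name and example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned to the same length; '-' marks a gap and
    never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 10-residue peptides differing at one position.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```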

In general, the physicochemical nature of the sequence of SEQ. ID. Nos. 1 to 11 should be preserved in a polypeptide of the invention. Such sequences will generally be similar in charge, hydrophobicity and size to that of SEQ. ID. Nos. 1 to 11. Examples of substitutions that do not greatly affect the physicochemical nature of amino acid sequences are those in which an amino acid from one of the following groups is substituted by a different amino acid from the same group:

H, R and K

I, L, V and M

A, G, S and T

D, E, Q and N.
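The four groups listed above can be encoded directly to test whether a given single-residue substitution stays within one group. This is a minimal sketch; the function name is hypothetical, and residues that appear in none of the groups are simply treated as having no conservative partner.

```python
# The four substitution groups given in the text, in one-letter code.
CONSERVATIVE_GROUPS = [
    set("HRK"),
    set("ILVM"),
    set("AGST"),
    set("DEQN"),
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` stays within one group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


print(is_conservative("K", "R"))  # both residues are in the H, R, K group
print(is_conservative("K", "D"))  # residues are in different groups
```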

However, it may be desirable to alter the physicochemical nature of the sequence of SEQ. ID. Nos. 1 to 11 in order to increase its therapeutic effectiveness. For example, many of the amino acids in the polypeptides of the invention are acidic. For example, residues 975 to 1044 (SEQ. ID. No. 8) as a whole are of an acidic nature. This acidity is believed to facilitate binding to a mammalian tooth. Thus, it may be desirable to increase the acidity of polypeptides of the invention by adding acidic residues or by substituting acidic residues for non-acidic ones. Acidic residues include aspartic acid and glutamic acid.
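As a rough illustration of increasing acidity by substitution, the sketch below computes the fraction of acidic residues (aspartic acid, D, and glutamic acid, E) in a peptide before and after replacing one residue. The helper names and the four-residue example peptide are hypothetical, and the fraction is only a crude proxy for the acidic character discussed above.

```python
ACIDIC_RESIDUES = set("DE")  # aspartic acid and glutamic acid


def acidic_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that are D or E."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(1 for residue in peptide if residue in ACIDIC_RESIDUES) / len(peptide)


def substitute(peptide: str, position: int, new_residue: str) -> str:
    """Return a copy of the peptide with the residue at a 0-based position replaced."""
    if not 0 <= position < len(peptide):
        raise IndexError("position outside peptide")
    return peptide[:position] + new_residue + peptide[position + 1:]


# Hypothetical peptide, not taken from SEQ. ID. Nos. 1 to 11.
peptide = "ADEK"
variant = substitute(peptide, 0, "E")  # replace the non-acidic A with acidic E
print(acidic_fraction(peptide), acidic_fraction(variant))
```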

Where polypeptides of the invention are synthesised chemically, D-amino acids (which do not occur in nature) may be incorporated into the amino acid sequence at sites where they do not affect the polypeptides' biological properties. This reduces the polypeptides' susceptibility to proteolysis by the recipient's proteases.

The nucleic acid sequences encoding the polypeptides of the invention may be extended at one or both ends by any non-wild-type sequence.

Thus, the polypeptides of the invention may be extended at either or both of the C- and N- termini by an amino acid sequence of any length. For example, an extension may comprise up to 5, up to 10, up to 20, up to 50, or up to 100 or 200 or more amino acids. An N-terminal extension may have any sequence apart from that which is N-terminal to the sequence of the invention (or the native sequence from which it is derived) in native SA I/II. A C-terminal extension may have any sequence apart from that which is C-terminal to the sequence of the invention (or the native sequence from which it is derived) in native SA I/II. Thus, the polypeptides of the invention may be extended at either or both of the C- and N- termini by any non-wild-type sequence.

The polypeptides of the invention may be attached to other polypeptides or proteins that enhance their antigenic properties. Thus, polypeptides of the invention may be attached to one or more other antigenic polypeptides. These additional antigenic polypeptides may be derived from S. mutans or from another organism. Possible additional antigenic polypeptides include heterologous T-cell epitopes derived from other S. mutans proteins or from species other than S. mutans. Heterologous B-cell epitopes may also be used. Such heterologous T-cell and/or B-cell epitopes may be of any length and epitopes of up to 5, up to 10 or up to 20 amino acids in length are particularly preferred. These additional antigenic polypeptides may be attached to the polypeptides of the invention chemically. Alternatively, one or more additional antigenic sequences may comprise an extension to a polypeptide of the invention.

A polypeptide of the invention may be subjected to one or more chemical modifications, such as glycosylation, sulphation, COOH-amidation or acylation. In particular, polypeptides that are acetylated at the N-terminus are preferred, as are polypeptides having C-terminal amide groups. Preferred polypeptides may have one or more of these modifications. For example, particularly preferred peptides may have a C-terminal amide group and N-terminal acetylation.

A polypeptide of the invention may form part of a larger polypeptide comprising multiple copies of the sequence of one or more of SEQ. ID. Nos. 1 to 11 or sequences related to them in any of the ways defined herein.

Polypeptides of the invention typically comprise at least 15 amino acids, for example 15 to 20, 20 to 50, 50 to 100 or 100 to 200 or 200 to 300 amino acids. Preferred polypeptides include those shown in SEQ. ID. Nos. 1 to 11.

Polypeptides according to the invention may be purified or substantially purified. Such a polypeptide in substantially purified form will generally form part of a preparation in which more than 90%, for example up to 95%, up to 98% or up to 99% of the peptide material in the preparation is that of a polypeptide or polypeptides according to the invention.

The nucleic acid sequences and polypeptides of the invention were originally derived from S. mutans. However, nucleic acid sequences and/or polypeptides of the invention may also be obtained from other organisms, typically bacteria, especially other streptococci. They may be obtained either by conventional cloning techniques or by probing genomic or cDNA libraries with nucleic acid sequences according to the invention. This can be done by any conventional method, such as the methods of Sambrook et al (Molecular Cloning: A Laboratory Manual; 1989).

A nucleic acid sequence according to the invention may be included within a vector, suitably a replicable vector, for instance a replicable expression vector.

A replicable expression vector comprises an origin of replication so that the vector can be replicated in a host cell such as a bacterial host cell. A suitable vector will also typically comprise the following elements, usually in a 5' to 3' arrangement: a promoter for directing expression of the nucleic acid sequence and optionally a regulator of the promoter, a translational start codon and a nucleic acid sequence according to the invention encoding a polypeptide having one or more of the biological properties of SA I/II. A non-replicable vector lacks a suitable origin of replication whilst a non-expression vector lacks an effective promoter.

The vector may also contain one or more selectable marker genes, for example an ampicillin resistance gene for the identification of bacterial transformants. One particularly preferred marker gene is the kanamycin resistance gene. Optionally, the vector may also comprise an enhancer for the promoter. If it is desired to express the nucleic acid sequence of the invention in a eukaryotic cell, the vector may also comprise a polyadenylation signal operably linked 3' to the nucleic acid encoding the functional protein. The vector may also comprise a transcriptional terminator 3' to the sequence encoding the polypeptide of the invention.

The vector may also comprise one or more non-coding sequences 3' to the sequence encoding the polypeptide of the invention. These may be from S. mutans (the organism from which the sequences of the invention are derived) or the host organism which is to be transformed with the vector or from another organism.

In an expression vector, the nucleic acid sequence of the invention is operably linked to a promoter capable of expressing the sequence. "Operably linked" refers to a juxtaposition wherein the promoter and the nucleic acid sequence encoding the polypeptide of the invention are in a relationship permitting the coding sequence to be expressed under the control of the promoter. Thus, there may be elements such as 5' non-coding sequence between the promoter and coding sequence. These elements may be native either to S. mutans or to the organism from which the promoter sequence is derived or to neither organism. Such sequences can be included in the vector if they enhance or do not impair the correct control of the coding sequence by the promoter.

The vector may be of any type. The vector may be in linear or circular form. For example, the vector may be a plasmid vector. Those of skill in the art will be able to prepare suitable vectors comprising nucleic acid sequences encoding polypeptides of the invention starting with widely available vectors which will be modified by genetic engineering techniques such as those described by Sambrook et al (Molecular Cloning: A Laboratory Manual; 1989). Preferred starting vectors include plasmids that confer kanamycin resistance and direct expression of the polypeptide of the invention via a tac promoter.

In an expression vector, any promoter capable of directing expression of a sequence of the invention in a host cell may be operably linked to the nucleic acid sequence of the invention. Suitable promoters include the tac promoter.

Such vectors may be used to transfect or transform a host cell. Depending on the type of vector, they may be used as cloning vectors to amplify DNA sequences according to the invention or to express this DNA in a host cell.

A further embodiment of the invention provides host cells harbouring vectors of the invention, i.e. cells transformed or transfected with vectors for the replication and/or expression of nucleic acid sequences according to the invention, including the sequences shown in SEQ. ID. Nos. 12 to 22. The cells will be chosen to be compatible with the vector and may for example be bacterial cells. Transformed or transfected bacterial cells, for example E. coli cells, will be particularly useful for amplifying nucleic acid sequences of the invention as well as for expressing them as polypeptides.

The cells may be transformed or transfected by any suitable method, such as the methods described by Sambrook et al (Molecular Cloning: A Laboratory Manual; 1989). For example, vectors comprising nucleic acid sequences according to the invention may be packaged into infectious viral particles, such as retroviral particles. The constructs may also be introduced, for example, by electroporation, calcium phosphate precipitation, biolistic methods or by contacting naked nucleic acid vectors with the cells in solution.

In the said nucleic acid vectors with which the host cells are transformed or transfected, the nucleic acid may be DNA or RNA, preferably DNA.

The vectors with which the host cells are transformed or transfected may be of any suitable type. The vectors may be able to effect integration of nucleic acid sequences of the invention into the host cell genome or they may remain free in the cytoplasm. For example, the vector used for transformation may be an expression vector as defined herein.

The present invention also provides a process of producing polypeptides according to the invention. Such a process will typically comprise transforming or transfecting host cells with vectors comprising nucleic acid sequences according to the invention and expressing the nucleic acid sequence in these cells. In this case, the nucleic acid sequence will be operably linked to a promoter capable of directing its expression in the host cell. Desirably, such a promoter will be a "strong" promoter capable of achieving high levels of expression in the host cell. It may be desirable to overexpress the polypeptide according to the invention in the host cell. Suitable host cells for this purpose include yeast cells and bacterial cells, for example E. coli cells, a particularly preferred E. coli strain being E. coli K12 strain BL 21. However, other expression systems can also be used, for example baculovirus systems in which the vector is a baculovirus having in its genome nucleic acid encoding a polypeptide of the invention and expression occurs when the baculovirus is allowed to infect insect cells.

The thus produced polypeptide of the invention may be recovered by any suitable method known in the art. Optionally, the thus recovered polypeptide may be purified by any suitable method, for example a method according to Sambrook et al (Molecular Cloning: A Laboratory Manual).

The polypeptides of the invention may also be synthesised chemically using standard techniques of peptide synthesis. For shorter polypeptides, chemical synthesis may be preferable to recombinant expression. In particular, peptides of up to 20 or up to 40 amino acid residues in length may desirably be synthesised chemically.

The nucleic acid sequences of the invention may be used to prepare probes and primers. These will be useful, for example, in the isolation of genes having sequences similar to that of SEQ. ID. No. 24. Such probes and primers may be of any suitable length, desirably from 10 to 100, for example from 10 to 20, 20 to 50 or 50 to 100 bases in length.

The present invention also provides antibodies to the polypeptides of the invention. These antibodies may be monoclonal or polyclonal. For the purposes of this invention, the term "antibody", includes fragments of whole antibodies which retain their binding activity for a target antigen. Such fragments include Fv, F(ab') and F(ab')₂ fragments, as well as single chain antibodies.

The antibodies may be produced by any method known in the art, such as the methods of Sambrook et al (Molecular Cloning: A Laboratory Manual; 1989). For example, they may be prepared by conventional hybridoma techniques or, in the case of modified antibodies or fragments, by recombinant DNA technology, for example by the expression in a suitable host vector of a DNA construct encoding the modified antibody or fragment operably linked to a promoter. Suitable host cells include bacterial (for example E. coli), yeast, insect and mammalian cells. Polyclonal antibodies may also be prepared by conventional means which comprise inoculating a host animal, for example a rat or a rabbit, with a peptide of the invention and recovering immune serum.

The present invention also provides pharmaceutical compositions comprising polypeptides of the invention. Three types of pharmaceutical compositions are particularly preferred. Firstly, compositions comprising polypeptides of the invention that include T-cell and/or B-cell epitopes may be used as vaccines against dental caries. Secondly, compositions comprising polypeptides of the invention that comprise adhesion sites will prevent or diminish adhesion of S. mutans to the tooth and can be used in the treatment of pre-existing cases of dental caries. Thirdly, compositions comprising polypeptides of the invention that include both one or more antigenic (T-cell or B-cell) epitopes and one or more adhesion epitopes can be used to effect vaccination against dental caries at the same time as curing pre-existing cases of the disease. A similar effect can be achieved by including in a composition one or more peptides comprising one or more antigenic epitopes and one or more peptides comprising one or more adhesion sites.

A range of mammalian species can be vaccinated against dental caries using the polypeptides of the invention. Vaccination of humans is particularly desirable.

The compositions of the invention may be administered to mammals including humans by any appropriate route. Suitable routes include topical application in the mouth, oral delivery by means of tablets or capsules and parenteral delivery, including subcutaneous, intramuscular, intravenous and intradermal delivery. Preferred routes of administration are topical application in the mouth and injection, typically subcutaneous or intramuscular injection, with a view to effecting systemic immunisation.

As previously indicated, polypeptides according to the invention may also be mixed with other antigens of different immunogenicity.

The compositions of the invention may be administered to the subject alone or in a liposome or associated with other delivery molecules. The effective dosage depends on many factors, such as whether a delivery molecule is used, the route of delivery and the size of the mammal being vaccinated. Typical doses are from 0.1 to 100 mg of the polypeptide of the invention per dose, for example 0.1 to 1 mg, and 1 to 5 mg, 5 to 10 mg and 10 to 100 mg per dose. Doses of from 1 to 5 mg are preferred.

Dosage schedules will vary according to, for example, the route of administration, the species of the recipient and the condition of the recipient. However, single doses and multiple doses spread over periods of days, weeks or months are envisaged. A regime for administering a vaccine composition of the invention to young human patients will conveniently be: 6 months, 2 years, 5 years and 10 years, with the initial dose being accompanied by adjuvant and the subsequent doses being about 1/2 to 1/4 the level of polypeptide in the initial dose. The frequency of administration can, however, be determined by monitoring the antibody levels in the patient.

Where the peptides of the invention are to be applied topically in the mouth, one preferred dosage regime is to apply one or more polypeptides of the invention on two or more occasions, for example 2 to 10 occasions over a period of a few weeks, for example one to six weeks. A particularly preferred regime of this type involves six applications of a polypeptide of the invention over a period of three weeks.

Typical doses for each topical application are in the range of 0.1 to 100 mg for example 0.1 to 1 mg, 1 to 10 mg and 10 to 100 mg. Doses of from 1 to 5 mg for each application are preferred.

While it is possible for polypeptides of the invention to be administered alone, it is preferable to present them as pharmaceutical formulations. The formulations of the present invention comprise at least one active ingredient, a polypeptide of the invention, together with one or more acceptable carriers therefor, for example liposomes, and optionally other therapeutic ingredients. The carrier or carriers must be "acceptable" in the sense of being compatible with the other ingredients of the formulation and not deleterious to the recipients thereof.

Formulations suitable for parenteral administration include aqueous and non-aqueous sterile injection solutions which may contain anti-oxidants, buffers, bacteriostats, bactericidal antibiotics and solutes which render the formulation isotonic with the blood of the intended recipient; and aqueous and non-aqueous sterile suspensions which may include suspending agents and thickening agents, and liposomes or other microparticulate systems which are designed to target the compound to blood components or one or more organs.

In particular, the polypeptides of the invention may be coupled to lipids or carbohydrates. This increases their ability to adhere to teeth, either by prolonging the duration of the adhesion or increasing its affinity, or both. This is particularly desirable for shorter polypeptides of the invention, which comprise up to around 40 amino acid residues.

Of the possible formulations, sterile pyrogen-free aqueous and non-aqueous solutions are preferred. Also preferred are formulations in which the polypeptides of the invention are contained in liposomes. Injection solutions and suspensions may be prepared extemporaneously from sterile powders, granules and tablets of the kind previously described.

Oral methods of administration may produce an effect systemically or locally in the mouth. Orally active preparations can be formulated in any suitable carrier, such as a gel, toothpaste, mouthwash or chewing gum.

It should be understood that in addition to the ingredients particularly mentioned above the formulations of this invention may include other agents conventional in the art having regard to the type of formulation in question.

Accordingly, the present invention provides a method of vaccinating a mammalian host against dental caries or treating dental caries, which method comprises administering to the host an effective amount of a pharmaceutical composition as described above, for example a vaccine composition.

Antibodies, including monoclonal antibodies, can be formulated for passive immunisation as indicated above for the formulation of polypeptides of the invention. Preferred formulations for passive immunisation include solid or liquid formulations such as gels, toothpastes, mouthwashes or chewing gum.

A further aspect of the present invention is a naked nucleic acid vaccine. In this embodiment, the vaccine composition comprises a nucleic acid, typically an isolated nucleic acid, preferably DNA, rather than a polypeptide. The nucleic acid is injected into a mammalian host and expressed in vivo, generating a polypeptide of the invention. This stimulates a T-cell response, which leads to protective immunity against dental caries in the same way as direct vaccination with a polypeptide of the invention.

Naked nucleic acid vaccination can be carried out with any nucleic acid according to the invention as long as it encodes a polypeptide that stimulates a T-cell and/or B-cell response. Preferred nucleic acids are those shown in SEQ. ID Nos. 1 to 11. These will typically be included within an expression vector as defined above. In such an expression vector, the nucleic acid according to the invention will typically be operably linked to a promoter capable of directing its expression in a mammalian host cell. For example, promoters from viral genes that are expressed in mammalian cells, such as the cytomegalovirus (CMV) immediate early gene promoter, are suitable. Also suitable are promoters from mammalian genes that are expressed in many or all mammalian cell types, such as the promoters of "housekeeping" genes. One such promoter is the p-hydroxymethyl-CoA-reductase (HMG) promoter (Gautier et al (1989): Nucleic Acids Research; 17, 8839).

For naked nucleic acid vaccination, it is preferred that the nucleic acid sequence according to the invention is incorporated into a plasmid vector, since it has been found that covalently closed circular (CCC) plasmid DNA can be taken up directly by muscle cells and expressed without being integrated into the cells' genomic DNA (Acsadi et al (1991): The New Biologist; 3, 71-81). Naked nucleic acid vaccines may be prepared as any of the types of formulation mentioned above in respect of conventional polypeptide-based vaccines. However, formulations suitable for parenteral injection, especially intramuscular injection, are preferred. Naked nucleic acid vaccines may be delivered in any of the ways mentioned above in respect of conventional polypeptide-based vaccines, but intramuscular injection is preferred.

Accordingly, the present invention provides a vaccine composition comprising a nucleic acid sequence or vector as described above and an acceptable carrier.

The following examples illustrate the invention.

EXAMPLES

Materials and Methods

Materials

Fmoc amino acids, benzotriazole-1-yl-oxy-tris-pyrrolidino-phosphonium hexafluorophosphate (PyBOP) and Rink Amide MBHA resin were purchased from Calbiochem-Novabiochem (UK) Ltd. (Nottingham, UK). Dimethylformamide, trifluoroacetic acid, diethyl ether, dichloromethane and piperidine were purchased from Romil Chemicals Ltd (Loughborough, UK). Di-isopropylethylamine was from Aldrich Chemical Co. (Dorset, UK). Oligonucleotides were purchased from Oswel DNA service (University of Edinburgh, Edinburgh, UK).

Bacteria and Growth Conditions

S. mutans Guy's strain (serotype c) was grown in 10 L basal medium supplemented as described previously (Russel et al (1978): Arch. Oral Biol., 2317; Russel et al (1980): Infect. Immun. 61, 5490) at 37° C. for 72 h for SA I/II preparation. For the adhesion assay, S. mutans was grown in Todd-Hewitt broth (Difco Laboratories, Detroit, Mich.). Escherichia coli BL21 (DE3) (Novagen Inc., Madison, Wis.) harbouring pET15b was grown at 37° C. in Luria-Bertani broth supplemented with carbenicillin (50 μg/ml) and recombinant protein expression was induced with isopropyl-β-D-thiogalactopyranoside (1 mM).

Antigens

SA I/II was prepared from S. mutans (serotype c, Guy's strain) as described by Russel et al (1980: Infect. Immun. 28, 486). Using the procedure of Munro et al (1993: Infect. Immun. 61, 4590), the portion of the gene encoding residues 984-1161 was amplified by using the oligonucleotide primers: (5') ATACATATGCCAACGTTCATTTCCATTACTTT (SEQ. ID. No. 25) and (3') GCCATTGTCGACTCATTCATTTTTATTAACCTTAGT (SEQ. ID. No. 26), cloned into pET15b (modified by the addition of a Sal I site) and expressed in E. coli.

Synthetic Peptides

Peptide amides (20-mers overlapping by 10 residues) were synthesised on Rink amide MBHA resin in sealed porous polypropylene bags by the manual simultaneous multiple peptide synthesis procedure (Houghten (1985) Proc. Natl. Acad. Sci. USA 82, 5131) using Fmoc chemistry. PyBOP was used as coupling agent and Fmoc amino acids were activated in situ by addition of diisopropylethylamine. Following 20 cycles of synthesis, resin was washed with dimethylformamide followed by dichloromethane, and peptides were cleaved by incubation in trifluoroacetic acid-ethanedithiol-anisole-phenol-H₂O (82.5:2.5:5:5:5; v/v/v/w/v) for 2 h at room temperature. Peptides were precipitated by the addition of 5 volumes of ether, recovered by centrifugation and washed three times with ether. Finally, peptides were dissolved in water and lyophilised. The scale of synthesis was 50 μmol. Aliquots of each peptide were hydrolysed in 6M HCl at 110° C. for 24 h and amino acid compositions were determined using the Beckman 121MB automated analyser (Beckman Instruments Ltd, Bucks, UK). In each case the composition matched that predicted.

Antibodies

MAbs L243 (anti-MHC class II) and W6/32 (anti-MHC class I) were produced from cultures of hybridomas obtained from the American Type Culture Collection (Rockville, Md., USA). ID4, an isotype (IgG2a) matched control of irrelevant specificity, was provided by Dr. P. Shepherd (Department of Immunology, UMDS, Guy's Hospital, London, UK). Rabbit anti-SA I/II antiserum was prepared as described previously (Russel et al (1980): Infect. Immun. 28, 486).

Lymphoproliferative Assay

Defibrinated blood from volunteers was separated on a Ficoll gradient. Serum was used for antibody assays (see below) while peripheral blood mononuclear cells (PBMCs) were washed and resuspended in RPMI 1640 (Sigma Chemical Co., St. Louis, Mo., USA) supplemented with 2 mM L-glutamine, penicillin (100 IU/ml), streptomycin sulphate (100 μg/ml) and 10% heat-inactivated autologous serum. PBMCs (10⁵ cells/well) were cultured in 96-well round-bottomed plates (Costar, Cambridge, Mass., USA) in a total volume of 200 μl. Three replicates of each culture were incubated with three concentrations (1, 10 and 40 μg/ml) of SA I/II, recombinant fragments, non-recombinant control or synthetic peptides. Incubation was at 37° C. in a humidified atmosphere with 5% CO₂ for 6 days. Each culture received 0.2 μCi (7.4 kBq) of [³H]-thymidine (Amersham International, Bucks, UK) 6 h before harvesting. Cultures were harvested onto glass fibre filters using a Dynatech (Chantilly, Va., USA) Minimal Cell harvester and [³H]-thymidine incorporation was measured using the LKB liquid scintillation counter (Bromma, Sweden). Proliferation was expressed as a stimulation index, which is the mean counts per minute (cpm) of antigen-stimulated cultures divided by the cpm of antigen-free cultures. Concanavalin A (10 μg/ml) (Sigma Chemical Co., St. Louis, Mo., USA) was used with every culture as a positive control, but the results are not presented.
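The stimulation index can be sketched in a few lines of code. This is a minimal illustration, not the analysis actually used in the study: the function names and the example counts are hypothetical, the antigen-free value is assumed to be a mean over replicate wells, and the positivity thresholds (S.I. ≧ 3.0, c.p.m. > 500) are those quoted in Example 1, with the c.p.m. criterion assumed to apply to the mean of the antigen-stimulated replicates.

```python
from statistics import mean


def stimulation_index(stimulated_cpm, unstimulated_cpm):
    """Mean c.p.m. of antigen-stimulated wells divided by mean c.p.m. of antigen-free wells."""
    return mean(stimulated_cpm) / mean(unstimulated_cpm)


def is_positive(stimulated_cpm, unstimulated_cpm, si_min=3.0, cpm_min=500):
    """Assumed positivity rule: S.I. >= si_min and mean stimulated c.p.m. > cpm_min."""
    si = stimulation_index(stimulated_cpm, unstimulated_cpm)
    return si >= si_min and mean(stimulated_cpm) > cpm_min


# Hypothetical triplicate counts for one subject and one antigen concentration.
example_stimulated = [3000, 3300, 2700]
example_unstimulated = [280, 300, 320]

print(stimulation_index(example_stimulated, example_unstimulated))
print(is_positive(example_stimulated, example_unstimulated))
```

With these made-up counts the index is 10 and the response would be scored as positive; a culture with a high index but very low absolute counts would be rejected by the c.p.m. criterion.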

MHC dependency of proliferative responses to SA I/II was determined by culturing cells with antigen (10 μg/ml) as above in the presence of MAbs L243, W6/32 or ID4 at 1, 10 and 20 μg/ml. Cultures were incubated with [³H]-thymidine, harvested and [³H]-thymidine uptake was determined as described above.

ELISA for Serum Antibodies

Antibody recognition of synthetic peptides was determined by ELISA. Peptides (10 μg/ml) in phosphate buffered saline (PBS) were adsorbed to wells of polystyrene microtitre plates (Dynatech) for 2 h at room temperature. Plates were washed and wells were treated with 1.5% (w/v) bovine serum albumin (BSA) for 1 h at room temperature to block unbound sites. After washing, bound peptides were incubated with serially diluted sera in duplicate. Bound IgG antibodies were determined by incubation with alkaline phosphatase-conjugated goat anti-human Ig (Sigma Chemical Co.) and subsequent reaction with paranitrophenyl phosphate (Sigma Chemical Co.). Plates were read at 405 nm using the microplate reader model 450 (Bio-Rad). After initial screening, the assay was repeated at least 3 times with each serum using a restricted set of peptides. SA I/II (2 μg/ml) was included in each assay, as was an irrelevant peptide (HQAAMQIIRDIINEEAADWD (SEQ. ID. No. 27)) derived from the sequence of SIV p27. Results are expressed as the highest dilution giving an absorbance ≧0.2.
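The end-point titre rule stated above (the highest dilution giving an absorbance ≧0.2) can be illustrated as follows. This is only a sketch: the function name, the dilution series and the absorbance readings are hypothetical and are not taken from the study.

```python
def endpoint_titre(readings, cutoff=0.2):
    """Return the highest reciprocal dilution whose absorbance is >= cutoff.

    readings: iterable of (reciprocal_dilution, absorbance) pairs.
    Returns None when no dilution reaches the cutoff.
    """
    positive = [dilution for dilution, absorbance in readings if absorbance >= cutoff]
    return max(positive) if positive else None


# Hypothetical two-fold dilution series for one serum against one peptide.
example_readings = [(20, 1.10), (40, 0.62), (80, 0.31), (160, 0.18), (320, 0.07)]
print(endpoint_titre(example_readings))
```

For the made-up series shown, the 1 in 80 dilution is the last one at or above the cut-off, so the function returns 80.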

Western Blotting

Serum antibody responses were also assayed by Western blotting using SA I/II, the recombinant polypeptides and a control fraction from E. coli BL21 harbouring non-recombinant pET15b. Purified antigens were separated by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE) with gels of 10% acrylamide, by using a mini-gel system (Hoefer Scientific Instruments, San Francisco, Calif., USA). Proteins were transferred to nitrocellulose with a semi-dry blotter (Sartorius A. G., Göttingen, Germany). Nitrocellulose strips were blocked with 5% (wt/vol) nonfat milk powder and 2.5% (wt/vol) BSA in Tris-HCl-buffered saline (pH 8.0) containing 0.05% (wt/vol) Tween 20. Strips were subsequently incubated with human sera (1 in 20 dilution) or rabbit anti-SA I/II antiserum (10⁻⁴ dilution) and bound antibody was visualised by using alkaline phosphatase-conjugated secondary antibody with 5-bromo-4-chloro-3-indolylphosphate and nitroblue tetrazolium (Sigma Chemical Co.) as substrates. Each serum was assayed three times and responses were considered positive if bands were visible in at least two assays.

Bacterial Adherence Assay

SA I/II mediated adherence of S. mutans (Guy's strain) to saliva was assayed by determining binding of [³H]-thymidine labelled bacteria to saliva adsorbed to microtitre wells. Freshly collected human saliva from a single donor was clarified by centrifugation for 10 min at 3000 g, heat-inactivated at 60° C. for 30 min and finally clarified by centrifugation at 17,000 g for 20 min. Treated saliva was diluted with an equal volume of PBS and adsorbed to the wells of a polystyrene 96-well flat-bottomed microtitre plate (Immulon 4; Dynatech) for 2 h at room temperature. After coating, wells were washed three times with PBS and unbound sites were blocked by incubation with 1.5% (wt/vol) BSA in PBS for 1 h at room temperature. Plates were then washed three times with 50 mM KCl-1 mM CaCl₂-38 mM MgCl₂-1 mM KH₂PO₄-1.2 mM K₂HPO₄ (pH 7.2; adherence buffer). S. mutans cells from an overnight culture in Todd-Hewitt broth were used to inoculate (1/10 volume) a further culture in Todd-Hewitt broth containing 100 μCi (3.7 MBq) [³H]-thymidine (Amersham International plc) per ml. Cells were harvested in late log phase (O.D. at 700 nm approximately 0.4), pelleted by centrifugation at 100 g for 10 min and washed three times in adherence buffer. The final suspension was vortexed with 0.5 volume glass beads to break up chains of cocci, which was monitored microscopically (Munro et al (1993): Infect. Immun. 61, 4590). Cells were resuspended to 5×10⁴ c.p.m. per 50 μl and BSA was added to 1.5% (wt/vol). Specific activity of the washed S. mutans cells was estimated to be 1.3×10⁻³ c.p.m. per cell (Munro et al (1993): Infect. Immun. 61, 4590). In competitive inhibition of adherence, the various synthetic peptides were added to the wells (at final concentrations of 62.5-500 μM) in 50 μl adherence buffer containing 1.5% (wt/vol) BSA together with 50 μl radiolabelled S. mutans suspension. Microtitre plates were incubated at 37° C. for 2 h with gentle shaking and subsequently were washed ten times with adherence buffer. Bound S. mutans cells were eluted with 1% (wt/vol) SDS and transferred to glass fibre filters by using the Micromate 196 cell harvester (Canberra Packard, Berks, UK). Filters were counted using the Matrix 96 direct beta counter (Canberra Packard). Background binding was determined on wells to which no saliva was adsorbed. The percentage of binding of S. mutans to saliva was calculated by the formula [((test c.p.m.) - (control c.p.m.))/total c.p.m.]×100. Percent inhibition of adherence was calculated as [(percent adherence without inhibitor - percent adherence with inhibitor)/percent adherence without inhibitor]×100. For proteins, determinations of streptococcal adhesion were made in triplicate or quadruplicate at each protein concentration, while for peptides duplicate determinations were made. In each case the assay was performed at least three times.
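The two adherence formulas, and the conversion from counts to cell numbers implied by the specific activity quoted above, can be written out as a short sketch. The function names and the example counts are hypothetical; the "control" value is assumed to be the background count from wells without adsorbed saliva, and the "total" value the input of 5×10⁴ c.p.m. per well.

```python
def percent_adherence(test_cpm, control_cpm, total_cpm):
    """[(test c.p.m. - control c.p.m.) / total c.p.m.] x 100."""
    return (test_cpm - control_cpm) / total_cpm * 100.0


def percent_inhibition(adherence_without, adherence_with):
    """[(adherence without inhibitor - adherence with inhibitor) / adherence without inhibitor] x 100."""
    return (adherence_without - adherence_with) / adherence_without * 100.0


def cells_from_cpm(cpm, cpm_per_cell=1.3e-3):
    """Approximate number of cells represented by a count, using the stated specific activity."""
    return cpm / cpm_per_cell


# Hypothetical counts: 5x10^4 c.p.m. added per well, background of 50 c.p.m.
total = 5e4
without_inhibitor = percent_adherence(test_cpm=1550, control_cpm=50, total_cpm=total)
with_inhibitor = percent_adherence(test_cpm=200, control_cpm=50, total_cpm=total)

print(f"adherence without inhibitor: {without_inhibitor:.1f}%")
print(f"adherence with inhibitor: {with_inhibitor:.1f}%")
print(f"inhibition: {percent_inhibition(without_inhibitor, with_inhibitor):.0f}%")
print(f"cells added per well: {cells_from_cpm(total):.2e}")
```

On these made-up numbers, adherence falls from about 3% to about 0.3%, which corresponds to roughly 90% inhibition. The stated specific activity implies that each well receives on the order of 4×10⁷ cells.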

Statistics

Student's t test was used to analyse results.
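For orientation, a two-sample t statistic can be computed as sketched below. The text does not state whether paired or unpaired comparisons were made, nor whether the tests were one- or two-tailed, so the unpaired, pooled-variance form shown here is an assumption, and the function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def students_t(sample_a, sample_b):
    """Unpaired, pooled-variance Student's t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical log2 titres against a test peptide and a control peptide.
t_value, degrees_of_freedom = students_t([2, 3, 4, 5, 6], [1, 2, 3, 4, 5])
print(t_value, degrees_of_freedom)
```

The statistic is then compared with the t distribution for the returned degrees of freedom to obtain a p value.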

Example 1 Preparation of a Panel of Overlapping Synthetic Peptides and Analysis of their Properties

T Cell Epitope Mapping

A panel of 32 overlapping synthetic peptides, spanning residues 803-1174 of SA I/II, was prepared as described above (see FIG. 1). Proliferative responses of PBMCs from 30 subjects were determined by stimulation with peptides (see FIG. 2). All subjects responded to at least one peptide, with a range of 1-8 peptides and a mean of 4.4 peptides. On the basis of frequency of response to each peptide (SI ≧ 3.0, c.p.m. > 500), 3 immunodominant epitopes were identified: peptides 803-822, 975-994 and 985-1004, each yielding frequencies > 50% (FIG. 1). Since most (13/15) subjects who responded to peptide 975-994 also responded to peptide 985-1004, it is probable that a single T-cell epitope is present within residues 975-1004. Minor T cell epitopes were also identified within peptides 1005-1024, 1015-1034, 1085-1104 and 1115-1134, with frequencies > 20%, and some of the adjacent peptides may represent single T cell epitopes.
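The frequency-based classification used above can be sketched as follows. The function names, labels and responder counts are hypothetical; only the two cut-offs (frequencies above 50% for immunodominant epitopes and above 20% for minor epitopes) come from the text.

```python
def response_frequency(responses):
    """Fraction of subjects with a positive response; responses is a list of booleans."""
    return sum(responses) / len(responses)


def classify_epitope(frequency, dominant_above=0.5, minor_above=0.2):
    """Label a peptide by the fraction of subjects responding to it."""
    if frequency > dominant_above:
        return "immunodominant"
    if frequency > minor_above:
        return "minor"
    return "below threshold"


# Hypothetical example: 16 responders among 30 subjects.
subjects = [True] * 16 + [False] * 14
print(classify_epitope(response_frequency(subjects)))
```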

MHC Restriction of the Lymphoproliferative Responses (See FIG. 3 and Table 1)

HLA restriction of the T cell response was first studied by dose-dependent inhibition with MAbs to HLA class I and II antigens (FIG. 2). The lymphoproliferative response was inhibited by 50% with μg of MAb to HLA class II (L243) and 10 μg of the MAb inhibited 100% of the responses (from SI 10.0±3.2 to SI 1.5±0.4). Neither the MAb to HLA class I (W6/32) nor the isotype control induced any inhibition of the lymphoproliferative response.

The HLA-DR types of 17 subjects were determined and 6 of these were homozygous. The responses to the immunodominant and minor epitopes were then studied in the 6 DR homozygous subjects (Table 1). Only peptide 975-994 appeared to be restricted by HLA-DR1. The other 6 peptides stimulated lymphocytes from HLA-DR1, 2 (except AA 1085-1104) and DR6 (except AA 803-822). DR5 was restricted by peptide 803-822, though the latter also stimulated lymphocytes with DR1, 2 and 3 antigens. Lymphocytes with DR3 or 4 antigen responded to 3 or 4 peptides. The results suggest that, except for peptide 975-994, the remaining 6 peptides are promiscuous, as they stimulated lymphocytes with 3 to 5 HLA-DR antigens.

                                        TABLE 1
_______________________________________________________________________________________________
DR   803-822      975-994     985-1004     1005-1024    1015-1034    1085-1104    1115-1134
_______________________________________________________________________________________________
1    4.1 ± 1.0    4.0 ± 1.3   5.8 ± 1.8    3.2 ± 0.6    3.3 ± 1.1    3.3 ± 1.3    3.2 ± 0.6
2    19.3 ± 6.6   2.2 ± 0.4   16.7 ± 1.7   14.6 ± 5.7   11.2 ± 5.2   0.6 ± 0.3    14.7 ± 3.3
3    6.1 ± 2.7    0.7 ± 0.2   4.1 ± 2.3    1.0 ± 0.2    2.1 ± 1.7    4.3 ± 1.2    1.9 ± 2.3
4    2.5 ± 0.8    1.8 ± 0.7   3.0 ± 0.3    3.2 ± 0.5    1.6 ± 0.1    3.7 ± 0.6    1.5 ± 0.7
5    6.8 ± 1.0    1.8 ± 1.3   2.0 ± 0.8    2.3 ± 0.5    1.3 ± 0.3    1.2 ± 0.4    2.9 ± 2.8
6    2.6 ± 1.5    2.9 ± 0.9   3.5 ± 0.5    8.3 ± 3.1    5.7 ± 2.4    5.6 ± 1.4    5.0 ± 2.0
_______________________________________________________________________________________________
The relationship between HLA-DR1-6 and the T cell responses to 7 synthetic peptides.
S.I. (±sem) values of subjects homozygous for DR are shown.
Positive responses (S.I. > 3.0, c.p.m. > 500) are in bold.
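The degree of HLA-DR promiscuity of each peptide can be tallied from the mean S.I. values of Table 1, as in the sketch below. This is illustrative only: the c.p.m. criterion cannot be applied because the table reports stimulation indices alone, the standard errors are ignored, the peptide labels follow the numbering used in the text, and the threshold is taken as S.I. ≧ 3.0 (the criterion given in the T cell epitope mapping section), which is an assumption that affects the borderline value of 3.0 for DR4 and peptide 985-1004.

```python
PEPTIDES = ["803-822", "975-994", "985-1004", "1005-1024",
            "1015-1034", "1085-1104", "1115-1134"]

# Mean stimulation indices from Table 1, one row per HLA-DR type (sem omitted).
SI_BY_DR = {
    1: [4.1, 4.0, 5.8, 3.2, 3.3, 3.3, 3.2],
    2: [19.3, 2.2, 16.7, 14.6, 11.2, 0.6, 14.7],
    3: [6.1, 0.7, 4.1, 1.0, 2.1, 4.3, 1.9],
    4: [2.5, 1.8, 3.0, 3.2, 1.6, 3.7, 1.5],
    5: [6.8, 1.8, 2.0, 2.3, 1.3, 1.2, 2.9],
    6: [2.6, 2.9, 3.5, 8.3, 5.7, 5.6, 5.0],
}


def dr_types_responding(threshold=3.0):
    """Map each peptide to the HLA-DR types whose mean S.I. is >= threshold."""
    result = {}
    for column, peptide in enumerate(PEPTIDES):
        result[peptide] = [dr for dr, values in SI_BY_DR.items()
                           if values[column] >= threshold]
    return result


for peptide, dr_types in dr_types_responding().items():
    print(peptide, dr_types, len(dr_types))
```

Under these assumptions peptide 975-994 is associated with DR1 only, while each of the other six peptides is associated with three to five DR types.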

B Cell Epitope Mapping (see FIG. 4)

Recognition of the recombinant fragments was assessed by Western blotting. Representative blots obtained with sera from 3 individuals are shown in FIG. 3 together with a positive control using rabbit anti-SA I/II antiserum. In panel a, SA I/II and 984-1161 were recognised strongly. Rabbit anti-SA I/II antiserum used as a positive control (panel d) recognised recombinant 984-1161. The recombinant polypeptide corresponding to residues 984-1161 was also analysed. SA I/II was recognised by all subjects. B cell epitopes were mapped by ELISA using the panel of synthetic peptides. The panel of peptides was screened with sera from 22 individuals, and 8 peptides which were recognised by more than one individual, together with one peptide which was not recognised, were selected for further analyses (FIG. 5). SA I/II was recognised by all subjects with a mean log₂ titre of 7.6±1.2. Titres against peptides were lower, with only that against peptide 824-843 (mean log₂ titre 4.7±1.1) being significantly greater than the titre against the control SIV p27 peptide (t=7.28, p<0.01). The proportion of significant titres (> mean + 2 standard deviations) was also calculated (FIG. 5) and only peptide 824-843 showed a high frequency (18/22). Indeed, an immunodominant B cell epitope is present within peptide 824-843, possibly shared with the overlapping peptide 834-853, while peptides 925-944, 1035-1054 and 1085-1104 constitute minor B cell epitopes. Despite the high frequency of responses to the recombinant polypeptide 984-1161 (described above), a very low frequency of responses was observed to peptides within this region.

Saliva samples from the subjects were cultured to determine levels of S. mutans. In 66% of individuals S. mutans was detected (range 10³ -10⁵ colony forming units/ml). There was no correlation between S. mutans levels and recognition of particular epitopes or titre against SA I/II.

Adhesion Epitope Mapping

Adherence of S. mutans to saliva-coated microtitre wells (a model of the tooth surface) was determined with [³ H]-thymidine labelled S. mutans. The proportion of adhering bacteria was in the range 1-5%. In the absence of saliva, the proportion of adhering bacteria was <0.1%.

In a series of competitive inhibition assays, the panel of synthetic peptides was assayed for inhibition of adhesion of S. mutans to saliva-coated microtitre wells. Peptides 1005-1024, 1025-1044 and 1085-1104 consistently inhibited adhesion with maximal inhibition ≧90% at concentrations of 500 μM (FIG. 6). Adjacent peptides 1015-1034 and 1095-1114 showed more variable and lower inhibition, and may be part of the adhesion epitopes.

Example 2 Construction of an Expression Vector and Expression of a Recombinant Polypeptide of the Invention (SEQ. ID. No. 8)

Using the oligonucleotide primers TAT CAT ATG CAA GAT CTT CCA ACA CCT CCA TCT ATA (5') (SEQ. ID. No. 29) and GTC GAC TCA TAC CAA GAC AAA GGA AGT TGT (3') (SEQ. ID. No. 30), the portion of the SA I/II gene encoding residues 975-1044 (SEQ. ID. No. 8) was amplified by polymerase chain reaction. The amplified gene fragment (with introduced Nde I and Sal I restriction enzyme sites) was cloned using the TA cloning system and was subcloned into the plasmid pET15b. The recombinant polypeptide was expressed in E. coli BL21 (DE3).
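The presence of the introduced restriction sites in the two primers can be checked with a short script. This is a sketch for illustration only: the helper names are hypothetical, and the recognition sequences used (CATATG for Nde I, GTCGAC for Sal I) are the standard ones for these enzymes rather than something stated in this specification.

```python
NDE_I_SITE = "CATATG"
SAL_I_SITE = "GTCGAC"

# Primer sequences as given in Example 2, with the triplet spacing removed.
FORWARD_PRIMER = "TAT CAT ATG CAA GAT CTT CCA ACA CCT CCA TCT ATA".replace(" ", "")
REVERSE_PRIMER = "GTC GAC TCA TAC CAA GAC AAA GGA AGT TGT".replace(" ", "")


def reverse_complement(sequence):
    """Reverse complement of an upper-case DNA sequence."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_site(sequence, site):
    """Zero-based position of the first occurrence of site, or -1 if absent."""
    return sequence.find(site)


print("Nde I site in 5' primer at index", find_site(FORWARD_PRIMER, NDE_I_SITE))
print("Sal I site in 3' primer at index", find_site(REVERSE_PRIMER, SAL_I_SITE))
print("3' primer on the coding strand:", reverse_complement(REVERSE_PRIMER))
```

Read on the coding strand, the 3' primer ends in TGA followed directly by the Sal I site, which is consistent with a stop codon having been placed immediately after the last encoded residue, although the specification does not say so explicitly.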

Example 3 Stimulation of an in vitro T-cell Response by the Recombinant Polypeptide (SEQ. ID. No. 8)

Peripheral blood lymphocytes from human volunteers were prepared as described above. Cells were incubated with purified recombinant polypeptide 975-1044 at concentrations of 40, 10 and 1 μg/ml. Cells were also incubated with a protein fraction prepared in the same way from E. coli harbouring non-recombinant plasmid. Proliferative responses of 17 subjects were determined. Mean stimulation index (±sem) was 11.6±2.3 compared with 2.4±0.3 for the control. The frequency of subjects responding (i.e. those with stimulation index≧control+2SD) was 15/17.

Example 4 Immunisation of Mice with the Recombinant Polypeptide (SEQ. ID. NO. 8) (See FIG. 7)

i) Groups of mice (3-4 per group) were immunised with 975-1044 (SEQ. ID. No. 8) by two routes:

a) intraperitoneally with 50 μg polypeptide in incomplete Freund's adjuvant with a boost after 4 weeks (also 50 μg in incomplete Freund's adjuvant and intraperitoneally).

b) subcutaneously. A single immunisation with 50 μg polypeptide in incomplete Freund's adjuvant.

ii) Draining lymph nodes were removed 10 to 14 days after immunisation, pooled and homogenised to give a single cell suspension in RPMI 1640 culture medium supplemented with 2 mM glutamine, 1 mM pyruvate, 50 mM 2-mercaptoethanol, 100 u/ml penicillin, 100 μg/ml streptomycin, 100 mM HEPES and 5% foetal calf serum. Cells (2×10⁵/well) were cultured with antigen and proliferation was measured by incorporation of [³H]-thymidine as described above. Antigens were SA I/II, recombinant polypeptides, peptides spanning residues 975-1044 and a control protein fraction from E. coli harbouring non-recombinant plasmid.

As shown in FIG. 7, all mouse strains responded to SA I/II and the recombinant polypeptide 975-1044 (SEQ. ID. No. 8). Positive responses to peptides were those with a stimulation index ≧3.0 (cpm>500). SJL mice responded to peptide 985-1004 and DBA/1 mice responded to peptides 975-994 and 985-1004. For BALB/c mice, no significant responses to peptides were observed, although the response to peptide 985-1004 was greater than responses to the remaining peptides.

iii) Antibody Recognition (See Table 2)

Sera from mice immunised intraperitoneally with polypeptide 975-1044 recognised intact cells of S. mutans, intact SA I/II and recombinant 975-1044. Peptides 995-1014 and 1025-1044 were also recognised. The titre for each strain is given in Table 2, which shows log₂ titres where the initial dilution was 1 in 50 (titre=1).

                                                  TABLE 2
Antibody recognition of S. mutans, SA I/II and peptides.
____________________________________________________________________________________________________________________
         ANTIGEN                                       PEPTIDES
STRAIN   S. mutans   SA I/II   975-1044   NR Control   975-994   985-1004   995-1014   1005-1024   1015-1034   1025-1044
____________________________________________________________________________________________________________________
SJL      4.0         4.0       10.3       --           --        --         8.7        1.0         --          5.7
DBA/1    3.0         2.7       10.7       --           0.7       0.7        4.7        2.0         --          7.3
BALB/c   2.0         2.8       10.7       --           --        --         5.7        3.0         --          4.7
____________________________________________________________________________________________________________________
Numbers in the table are log₂ titres (1 = 1:50)
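The log₂ titre scale of Table 2 can be converted back to a reciprocal serum dilution as sketched below. The sketch assumes a two-fold dilution series starting at 1 in 50, so that each unit on the scale corresponds to one doubling; the specification states only that a titre of 1 corresponds to 1:50, and the function names are hypothetical. Fractional values in the table are presumably group means.

```python
from math import log2


def reciprocal_dilution(log2_titre, initial_dilution=50):
    """Reciprocal dilution for a log2 titre, where a titre of 1 is the initial dilution."""
    return initial_dilution * 2 ** (log2_titre - 1)


def log2_titre_from_dilution(reciprocal, initial_dilution=50):
    """Inverse conversion: log2 titre for a given reciprocal dilution."""
    return log2(reciprocal / initial_dilution) + 1


print(reciprocal_dilution(1))     # initial dilution, 1 in 50
print(reciprocal_dilution(4.0))   # three doublings beyond the initial dilution
print(log2_titre_from_dilution(400))
```

On this reading, the titre of 4.0 recorded for SJL sera against S. mutans corresponds to a 1 in 400 dilution.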

Example 5 Analysis of the Interaction Between Streptococcal Antigen I/II and Salivary Receptor Using BIAcore

Aims

In this study, we have used surface plasmon resonance (spr) to analyse the interaction between purified SA I/II and whole human saliva or purified salivary receptor. In addition we have investigated the calcium dependence of binding, identified individual amino acid residues which may be involved in binding and determined the affinity of the interaction between SA I/II and salivary receptor.

Methods

Materials

SA I/II and recombinant polypeptides were prepared as described above. Salivary receptor was prepared by absorption of whole saliva with intact cells of S. mutans (Lee et al (1989) Infect. Immun. 57:3306-3313). The cells were washed with KPBS (2.7 mM KCl, 137 mM NaCl in 1.5 mM KH₂PO₄, 6.5 mM Na₂HPO₄, pH 7.2) and adsorbed material was eluted with 1 mM EDTA in KPBS. Analysis of the purified material by polyacrylamide gel electrophoresis in the presence of sodium dodecyl sulphate indicated the presence of components of Mr > 200,000 and approximately 40,000. Peptides were prepared by the simultaneous multiple peptide synthesis procedure (Houghten (1985) Proc. Natl. Acad. Sci. USA 82:5131-5135) as above. In addition, a series of peptides was synthesised corresponding to residues 1025-1044 in which each residue in turn was substituted by alanine.

Binding Analyses

Purified SA I/II or salivary receptor was immobilised on the sensor chip surface at a concentration of 100 μg/ml in 10 mM Na formate pH 3.5 using the amine coupling kit (Pharmacia Biosensor).

i. Inhibition Studies

Binding of immobilised SA I/II to receptors in whole saliva was determined in the absence and presence of inhibitors (at varying concentrations). Inhibition by alanine-substituted peptides was analysed at a peptide concentration of 50 μM. The running buffer was HEPES buffered saline (HBS) and the surface was regenerated with 100 mM HCl.

ii. Direct Binding

Purified salivary receptor was immobilised on the sensor chip and binding of SA I/II or purified recombinant polypeptide fragments was determined.

Results

i. Calcium Dependency

In separate determinations with whole saliva, binding to immobilised SA I/II varied from approximately 250 to 800 resonance units (RU). In the presence of EDTA, binding was inhibited, with maximal inhibition of 95% at a concentration of 10 mM EDTA. Subsequent binding assays were performed in the presence of 5 mM calcium.

ii. Inhibition of Binding

Purified SA I/II or recombinant polypeptide fragments 1 (residues 39-481), 2 (residues 475-824), 3 (residues 816-1213), 4 (residues 1155-1538) and recombinant 984-1161 were added to fluid phase saliva as competitive inhibitors at concentrations varying from 0-20 μM. SA I/II inhibited binding most efficiently with approximately 90% inhibition at a concentration of 6 μM (FIG. 8). Of the recombinant fragments, only fragment 3 and r984-1161 inhibited binding to salivary receptors to a significantly greater extent than the control (bovine serum albumin) with maximal inhibition of 65% and 50%, respectively (FIG. 8).
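The inhibition figures quoted here are percentages of the uninhibited binding signal. As a minimal sketch, assuming the usual definition of percent inhibition (the reduction in response relative to the response without inhibitor), the calculation looks as follows. The function name and the resonance-unit values in the usage lines are hypothetical and are not data from this study.

```python
def percent_inhibition(ru_without_inhibitor: float, ru_with_inhibitor: float) -> float:
    """Percent reduction of the binding response caused by an inhibitor.

    Assumes the conventional definition:
    100 * (uninhibited - inhibited) / uninhibited.
    """
    if ru_without_inhibitor <= 0:
        raise ValueError("the uninhibited response must be positive")
    return 100.0 * (ru_without_inhibitor - ru_with_inhibitor) / ru_without_inhibitor


if __name__ == "__main__":
    # Hypothetical responses in resonance units (RU), for illustration only.
    print(percent_inhibition(500.0, 50.0))   # response falls from 500 RU to 50 RU
    print(percent_inhibition(500.0, 175.0))  # response falls from 500 RU to 175 RU
```

With these hypothetical inputs, a fall from 500 RU to 50 RU is 90% inhibition and a fall to 175 RU is 65% inhibition.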

A panel of synthetic peptides (20 mers overlapping by 10) spanning residues 803-1174 was assayed for inhibitory activity. Peptide 1025-1044 was the most effective inhibitor, although 10-20 fold higher concentrations were required than for polypeptides (FIG. 9). A panel of peptides in which each of the residues 1025-1044 in turn was substituted with alanine (alanine was substituted by serine where it occurred naturally) was also analysed for inhibitory activity. Substitution of Glu 1037 consistently abolished inhibition mediated by the peptide (FIG. 10). Similarly, substitution of Gln 1025, Thr 1039, Phe 1041, Val 1042, Leu 1043 and Val 1044 reduced the inhibition of binding which was mediated by the peptide 1025-1044.
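The alanine-substitution panel described above can be enumerated programmatically. The sketch below takes the sequence of peptide 1025-1044 to be the last 20 residues of SEQ. ID. No. 2 (residues 1005 to 1044) and applies the stated rule: each residue in turn becomes alanine, and a naturally occurring alanine becomes serine. The function and variable names are illustrative assumptions.

```python
# Residues 1025-1044, read from the last 20 residues of SEQ. ID. No. 2.
PEPTIDE_1025_1044 = "QLKTADLPAGRDETTSFVLV"
FIRST_RESIDUE = 1025


def alanine_scan(peptide: str, first_residue: int) -> dict[int, str]:
    """Return {residue number: variant peptide} with one substitution each.

    Every residue is replaced by alanine (A); where the native residue is
    already alanine it is replaced by serine (S) instead.
    """
    variants = {}
    for offset, residue in enumerate(peptide):
        substitute = "S" if residue == "A" else "A"
        variants[first_residue + offset] = (
            peptide[:offset] + substitute + peptide[offset + 1:]
        )
    return variants


if __name__ == "__main__":
    for position, variant in alanine_scan(PEPTIDE_1025_1044, FIRST_RESIDUE).items():
        native = PEPTIDE_1025_1044[position - FIRST_RESIDUE]
        print(position, native, variant)
```

The numbering agrees with the residues named in the text: position 1037 is Glu, 1025 is Gln, 1039 is Thr and 1041-1044 are Phe, Val, Leu and Val.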

iii. Direct Binding

For these analyses, purified salivary receptor was immobilised on the sensor chip and binding to fluid phase SA I/II or recombinant polypeptides was determined. At a concentration of approximately 5 μM, both SA I/II and recombinant SA I/II bound to salivary receptor in the range 500-600 RU (FIG. 11). Binding of recombinant polypeptides was determined at a concentration of approximately 20 μM, and the highest binding was obtained with fragment 3 (1256 RU) (FIG. 11). Binding of the other fragments, although significantly greater than the myosin control, was not greater than the bovine serum albumin control and thus does not appear to be specific. Addition of EDTA (10 μM) in this assay completely inhibited binding of fluid phase SA I/II.

Affinity and rate constants for the adhesin-receptor interaction were determined for SA I/II, recombinant SA I/II and fragment 3 (Table 3). The values indicate a low affinity interaction with a slow association rate constant and a relatively rapid dissociation rate constant.

Conclusions

These analyses confirm that residues 816-1213 of SA I/II form an adhesion binding region and that, within this region, peptide 1025-1044 forms an adhesion epitope. We have now extended these findings by identifying specific residues which may be essential for binding to salivary receptor, namely residues 1025, 1037, 1039 and 1041-1044. The binding is EDTA sensitive and, under the assay conditions, is of relatively low affinity.

TABLE 3
______________________________________
                  SA I/II      recomb. SA I/II   FRAG 3
______________________________________
k_a (M⁻¹ s⁻¹)     n.d.         20.9 × 10³        1.5 × 10³
k_d (s⁻¹)         2 × 10⁻²     4.2 × 10⁻³        8.1 × 10⁻³
K_A (M⁻¹)         n.d.         5.0 × 10⁶         0.2 × 10⁶
______________________________________
n.d. not determined
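The affinity constants in Table 3 are consistent with the ratio of the two rate constants. The sketch below assumes a simple 1:1 binding model in which K_A = k_a / k_d and K_D = 1 / K_A; the text does not state which model was fitted, so this is an illustrative check rather than a description of the analysis performed. The rate constants are those listed in Table 3, and the function name is an assumption.

```python
def association_constant(k_a: float, k_d: float) -> float:
    """Equilibrium association constant K_A (M^-1) for a 1:1 interaction.

    Assumes K_A = k_a / k_d, with k_a in M^-1 s^-1 and k_d in s^-1.
    """
    return k_a / k_d


# Rate constants as listed in Table 3.
RATE_CONSTANTS = {
    "recomb. SA I/II": (20.9e3, 4.2e-3),
    "FRAG 3": (1.5e3, 8.1e-3),
}

if __name__ == "__main__":
    for name, (k_a, k_d) in RATE_CONSTANTS.items():
        K_A = association_constant(k_a, k_d)
        K_D = 1.0 / K_A  # equilibrium dissociation constant, in M
        print(f"{name}: K_A = {K_A:.2e} M^-1, K_D = {K_D * 1e6:.2f} uM")
```

Rounded to one decimal place in units of 10⁶ M⁻¹, the ratios give 5.0 and 0.2, matching the K_A row of the table. The corresponding dissociation constants are in the micromolar range, which agrees with the description of a relatively low affinity interaction.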

__________________________________________________________________________
SEQUENCE INFORMATION
__________________________________________________________________________
As a result of the experiments detailed above, the following sequences have been identified as being of particular interest.

(i) Residues 925 to 1114 (SEQ. ID. No. 1). This sequence comprises sequences (iv) and (v) below and includes 2 series of overlapping T-cell, B-cell and adhesion epitopes, a further B-cell epitope, a further T-cell epitope and an adhesion site.

SEQ. ID. No. 1:
TEKPLEPAPVEPSYEAEPTPPTPTPDQPEPNKPVEPTYEVIPTPPTDPVYQDLPTPPSI
PTVHFHYFKLAVQPQVNKEIRNNNDVNIDRTLVAKQSVVKFQLKTADLPAGRDETTSFV
LVDPLPSGYQFNPEATKAASPGFDVAYDNATNTVTFKATAATLATFNADLTKSVATIYP
TVVGQVLNDGATY

Its DNA sequence is (SEQ. ID. No. 12):
ACAGAAAAGCCGTTGGAGCCAGCACCTGTTGAGCCAAGCTATGAAGCAGAGCCAACGCCA
CCGACACCAACACCAGATCAACCAGAACCAAACAAACCTGTTGAGCCAACTTATGAGGTT
ATTCCAACACCGCCGACTGATCCTGTTTATCAAGATCTTCCAACACCTCCATCTATACCA
ACTGTTCATTTCCATTACTTTAAACTAGCTGTTCAGCCGCAGGTTAACAAAGAAATTAGA
AACAATAACGATGTTAATATTGACAGAACTTTGGTGGCTAAACAATCTGTTGTTAAGTTC
CAGCTGAAGACAGCAGATCTCCCTGCTGGACGTGATGAAACAACTTCCTTTGTCTTGGTA
GATCCCCTGCCATCTGGTTATCAATTTAATCCTGAAGCTACAAAAGCTGCCAGCCCTGGC
TTTGATGTCGCTTATGATAATGCAACTAATACAGTCACCTTCAAGGCAACTGCAGCAACT
TTGGCTACGTTTAATGCTGATTTGACTAAGTCAGTGGCAACGATTTATCCAACAGTGGTC
GGACAAGTTCTTAATGATGGCGCAACTTAT

(ii) Residues 1005 to 1044 (SEQ. ID. No. 2). This comprises a T-cell epitope overlapping two adhesion sites.

SEQ. ID. No. 2:
NNNDVNIDRTLVAKQSVVKFQLKTADLPAGRDETTSFVLV

Its DNA sequence is (SEQ. ID. No. 13):
AACAATAACGATGTTAATATTGACAGAACTTTGGTGGCTAAACAATCTGTTGTTAAGTTC
CAGCTGAAGACAGCAGATCTCCCTGCTGGACGTGATGAAACAACTTCCTTTGTCTTGGTA

(iii) Residues 1085 to 1104 (SEQ. ID. No. 3). Here, a T-cell epitope, a B-cell epitope and an adhesion site overlap.

SEQ. ID. No. 3:
LATFNADLTKSVATIYPTVV

Its DNA sequence is (SEQ. ID. No. 14):
TTGGCTACGTTTAATGCTGATTTGACTAAGTCAGTGGCAACGATTTATCCAACAGTGGTC

(iv) Residues 1005 to 1114 (SEQ. ID. No. 4). This comprises sequences (ii) and (iii) above and therefore includes two sequences in which a B-cell epitope, a T-cell epitope and an adhesion site overlap.

SEQ. ID. No. 4:
NNNDVNIDRTLVAKQSVVKFQLKTADLPAGRDETTSFVLVDPLPSGYQFNPEATKAASPGF
DVAYDNATNTVTFKATAATLATFNADLTKSVATIYPTVVGQVLNDGATY

Its DNA sequence is (SEQ. ID. No. 15):
AACAATAACGATGTTAATATTGACAGAACTTTGGTGGCTAAACAATCTGTTGTTAAGTTC
CAGCTGAAGACAGCAGATCTCCCTGCTGGACGTGATGAAACAACTTCCTTTGTCTTGGTA
GATCCCCTGCCATCTGGTTATCAATTTAATCCTGAAGCTACAAAAGCTGCCAGCCCTGGC
TTTGATGTCGCTTATGATAATGCAACTAATACAGTCACCTTCAAGGCAACTGCAGCAACT
TTGGCTACGTTTAATGCTGATTTGACTAAGTCAGTGGCAACGATTTATCCAACAGTGGTC
GGACAAGTTCTTAATGATGGCGCAACTTAT

(v) Residues 925 to 1004 (SEQ. ID. No. 5). This comprises a B-cell epitope, an immunodominant T-cell epitope and an adhesion site.

SEQ. ID. No. 5:
TEKPLEPAPVEPSYEAEPTPPTPTPDQPEPNKPVEPTYEVIPTPPTDPVYQDLPTPPSIPT
VHFHYFKLAVQPQVNKEIR

Its DNA sequence is (SEQ. ID. No. 16):
ACAGAAAAGCCGTTGGAGCCAGCACCTGTTGAGCCAAGCTATGAAGCAGAGCCAACGCCA
CCGACACCAACACCAGATCAACCAGAACCAAACAAACCTGTTGAGCCAACTTATGAGGTT
ATTCCAACACCGCCGACTGATCCTGTTTATCAAGATCTTCCAACACCTCCATCTATACCA
ACTGTTCATTTCCATTACTTTAAACTAGCTGTTCAGCCGCAGGTTAACAAAGAAATTAGA

(vi) Residues 925 to 1054 (SEQ. ID. No. 6). This comprises sequence (v) above, together with a further adjacent adhesion site and a further overlapping B-cell epitope.

SEQ. ID. No. 6:
TEKPLEPAPVEPSYEAEPTPPTPTPDQPEPNKPVEPTYEVIPTPPTDPVYQDLPTPPSIPT
VHFHYFKLAVQPQVNKEIRNNNDVNIDRTLVAKQSVVKFQLKTADLPAGRDETTSFVLVDP
LPSGYQFN

Its DNA sequence is (SEQ. ID. No. 17):
ACAGAAAAGCCGTTGGAGCCAGCACCTGTTGAGCCAAGCTATGAAGCAGAGCCAACGCCA
CCGACACCAACACCAGATCAACCAGAACCAAACAAACCTGTTGAGCCAACTTATGAGGTT
ATTCCAACACCGCCGACTGATCCTGTTTATCAAGATCTTCCAACACCTCCATCTATACCA
ACTGTTCATTTCCATTACTTTAAACTAGCTGTTCAGCCGCAGGTTAACAAAGAAATTAGA
AACAATAACGATGTTAATATTGACAGAACTTTGGTGGCTAAACAATCTGTTGTTAAGTTC
CAGCTGAAGACAGCAGATCTCCCTGCTGGACGTGATGAAACAACTTCCTTTGTCTTGGTA
GATCCCCTGCCATCTGGTTATCAATTTAAT

(vii) Residues 803 to 854 (SEQ. ID. No. 7). This comprises a major T-cell epitope and adjacent immunodominant B-cell epitope.

SEQ. ID. No. 7:
ETGKKPNIWYSLNGKIRAVNLPKVTKEKPTPPVKPTAPTKPTYETEKPLKPA

Its DNA sequence is (SEQ. ID. No. 18):
GAAACCGGCAAAAAACCAAATATTTGGTATTCATTAAATGGTAAAATCCGTGCGGTTAAT
CTTCCTAAAGTTACTAAGGAAAAACCCACACCTCCGGTTAAACCAACAGCTCCAACTAAA
CCAACTTATGAAACAGAAAAGCCATTAAAACCGGCA

(viii) Residues 975 to 1044 (SEQ. ID. No. 8). This comprises a T-cell epitope, a B-cell epitope and an adhesion site.

SEQ. ID. No. 8:
QDLPTPPSIPTVHFHYFKLAVQPQVNKEIRNNNDVNIDRTLVAKQSVVKFQLKTADLPAGR
DETTSFVLV

Its DNA sequence is (SEQ. ID. No. 19):
CAAGATCTTCCAACACCTCCATCTATACCAACTGTTCATTTCCATTACTTTAAACTAGCT
GTTCAGCCGCAGGTTAACAAAGAAATTAGAAACAATAACGATGTTAATATTGACAGAACT
TTGGTGGCTAAACAATCTGTTGTTAAGTTCCAGCTGAAGACAGCAGATCTCCCTGCTGGA
CGTGATGAAACAACTTCCTTTGTCTTGGTA

(ix) Residues 1024 to 1044 (SEQ. ID. No. 9). This comprises a T-cell epitope overlapping with an adhesion site.

SEQ. ID. No. 9:
FQLKTADLPAGRDETTSFVLV

Its DNA sequence is (SEQ. ID. No. 20):
TTCCAGCTGAAGACAGCAGATCTCCCTGCTGGACGTGATGAAACAACTTCCTTTGTCTTG
GTA

(x) Residues 803 to 1114 (SEQ. ID. No. 10). This comprises sequences (i) and (vii) above and some intervening sequence. Residues 803 to 1114 comprise 2 series of overlapping T-cell, B-cell and adhesion epitopes, a further T-cell epitope and a further adhesion site and an immunodominant B-cell epitope and a major T-cell epitope.

SEQ. ID. No. 10:
ETGKKPNIWYSLNGKIRAVNLPKVTKEKPTPPVKPTAPTKPTYETEKPLKPAPV
APNYEKEPTPPTRTPDQAEPKKPTPPTYETEKPLEPAPVEPSYEAEPTPPTRTPDQAE
PNKPTPPTYETEKPLEPAPVEPSYEAEPTPPTPTPDQPEPNKPVEPTYEVIPTPPTDP
VYQDLPTPPSIPTVHFHYFKLAVQPQVNKEIRNNNDVNIDRTLVAKQSVVKFQLKTAD
LPAGRDETTSFVLVDPLPSGYQFNPEATKAASPGFDVAYDNATNTVTFKATAATLATF
NADLTKSVATIYPTVVGQVLNDGATY

Its DNA sequence is (SEQ. ID. No. 21):
GAAACCGGCAAAAAACCAAATATTTGGTATTCATTAAATGGTAAAATCCGTGCGGTTAAT
CTTCCTAAAGTTACTAAGGAAAAACCCACACCTCCGGTTAAACCAACAGCTCCAACTAAA
CCAACTTATGAAACAGAAAAGCCATTAAAACCGGCACCAGTAGCTCCAAATTATGAAAAG
GAGCCAACACCACCGACAAGAACACCGGATCAAGCAGAGCCAAAGAAACCCACTCCGCCG
ACCTATGAAACAGAAAAGCCGTTGGAGCCAGCACCTGTTGAGCCAAGCTATGAAGCAGAG
CCAACACCGCCGACAAGGACACCGGATCAGGCAGAGCCAAATAAACCCACACCGCCGACC
TATGAAACAGAAAAGCCGTTGGAGCCAGCACCTGTTGAGCCAAGCTATGAAGCAGAGCCA
ACGCCACCGACACCAACACCAGATCAACCAGAACCAAACAAACCTGTTGAGCCAACTTAT
GAGGTTATTCCAACACCGCCGACTGATCCTGTTTATCAAGATCTTCCAACACCTCCATCT
ATACCAACTGTTCATTTCCATTACTTTAAACTAGCTGTTCAGCCGCAGGTTAACAAAGAA
ATTAGAAACAATAACGATGTTAATATTGACAGAACTTTGGTGGCTAAACAATCTGTTGTT
AAGTTCCAGCTGAAGACAGCAGATCTCCCTGCTGGACGTGATGAAACAACTTCCTTTGTC
TTGGTAGATCCCCTGCCATCTGGTTATCAATTTAATCCTGAAGCTACAAAAGCTGCCAGC
CCTGGCTTTGATGTCGCTTATGATAATGCAACTAATACAGTCACCTTCAAGGCAACTGCA
GCAACTTTGGCTACGTTTAATGCTGATTTGACTAAGTCAGTGGCAACGATTTATCCAACA
GTGGTCGGACAAGTTCTTAATGATGGCGCAACTTAT

(xi) Residues 975 to 1004 (SEQ. ID. No. 11), which comprise a T-cell epitope.

SEQ. ID. No. 11:
QDLPTPPSIPTVHFHYFKLAVQPQVNKEIR

Its DNA sequence is (SEQ. ID. No. 22):
CAAGATCTTCCAACACCTCCATCTATACCAACTGTTCATTTCCATTACTTTAAACTAGCT
GTTCAGCCGCAGGTTAACAAAGAAATTAGA

The amino acid sequence of SA I/II is as follows, beginning with residue No. 1 (SEQ ID No. 23).

MKVKKTYGFRKSKISKTLCGAVLGTVAAVSVAGQKVFADETTTT
SDVDTKVVGTQTGNPATNLPEAQGSASKQAEQSQTKLERQMVHTIEVPKTDLDQAAKD
AKSAGVNVVQDADVNKGTVKTAEEAVQKETEIKEDYTKQAEDIKKTTDQYKSDVAAHE
AEVAKIKAKNQATKEQYGKDMVAHKAEVERINAANAASKTAYEAKLAQYQADLAAVQK
TNAANQASYQKALAAYQAELKRVQEANAAAKAAYDTAVAANNAKNTEIAAANEEIRKR
NATAKAEYETKLAQYQAELKRVQEANAANEADYQAKLTAYQTELARVQKANADAKAAY
EAAVAANNAKNAALTAENTAIKQRNENAKATYEAALKQYEADLAAVKKANAANEADYQ
AKLTAYQTELARVQKANADAKAAYEAAVAANNAANAALTAENTAIKKRNADAKADYEA
KLAKYQADLAKYQKDLADYPVKLKAYEDEQASIKAALAELEKHKNEDGNLTEPSAQNL
VYDLEPNANLSLTTDGKFLKASAVDDAFSKSTSKAKYDQKILQLDDLDITNLEQSNDV
ASSMELYGNFGDKAGWSTTVSNNSQVKWGSVLLERGQSATATYTNLQNSYYNGKKISK
IVYKYTVDPKSKFQGQKVWLGIFTDPTLGVFASAYTGQVEKNTSIFIKNEFTFYDEDG
KPINFDNALLSVASLNRENNSIEMAKDYTGKFVKISGSSIGEKNGMIYATDTLNFRQG
QGGARWTMYTRASEPGSGWDSSDAPNSWYGAGAIRMSGPNNSVTLGAISSTLVVPADP
TMAIETGKKPNIWYSLNGKIRAVNLPKVTKEKPTPPVKPTAPTKPTYETEKPLKPAPV
APNYEKEPTPPTRTPDQAEPKKPTPPTYETEKPLEPAPVEPSYEAEPTPPTRTPDQAE
PNKPTPPTYETEKPLEPAPVEPSYEAEPTPPTPTPDQPEPNKPVEPTYEVIPTPPTDP
VYQDLPTPPSIPTVHFHYFKLAVQPQVNKEIRNNNDVNIDRTLVAKQSVVKFQLKTAD
LPAGRDETTSFVLVDPLPSGYQFNPEATKAASPGFDVAYDNATNTVTFKATAATLATF
NADLTKSVATIYPTVVGQVLNDGATYKNNFSLTVNDAYGIKSNVVRVTTPGKPNDPDN
PNNNYIKPTKVNKNENGVVIDGKTVLAGSTNYYELTWDLDQYKNDRSSADTIQQGFYY
VDDYPEEALELRQDLVKITDANGNEVTGVSVDNYTSLEAAPQEIRDVLSKAGIRPKGA
FQIFRADNPREFYDTYVKTGIDLKIVSPMVVKKQMGQTGGSYEDQAYQIDFGNGYASN
IVINNVPKINPKKDVTLTLDPADTNNVDGQTIPLNTVFNYRLIGGIIPANHSEELFEY
NFYDDYDQTGDHYTGQYKVFAKVDITLKNGVIIKSGTELTQYTTAEVDTTKGAITIKF
KEAFLRSVSIDSAFQAESYIQMKRIAVGTFENTYINTVNGVTYSSNTVKTTTPEDPAD
PTDPQDPSSPRTSTVIIYKPQSTAYQPSSVQKTLPNTGVTNNAYMPLLGIIGLVTSFSL
LGLKAKKD

Its DNA sequence is as follows (SEQ ID No. 24):

   1 ATTTCAGCAA AAATTGACAA ATCAAATCAA TTATATTACA ATTTTTTAAC
  51 GTATATTACA AAAATATATT TGGAAGATTT ATTCAGATTT GGAGGATTTA
 101 TGAAAGTCAA AAAAACTTAC GGTTTTCGTA AAAGTAAAAT TAGTAAAACA
 151 CTGTGTGGTG CTGTTCTAGG AACAGTAGCA GCAGTCTCTG TAGCAGGACA
 201 AAAGGTTTTT GCCGATGAAA CGACCACTAC TAGTGATGTA GATACTAAAG
 251 TAGTTGGAAC ACAAACTGGA AATCCAGCGA CCAATTTGCC AGAGGCTCAA
 301 GGAAGTGCGA GTAAGCAAGC TGAACAAAGT CAAACCAAGC TGGAGAGACA
 351 AATGGTTCAT ACCATTGAAG TACCTAAAAC TGATCTTGAT CAAGCAGCAA
 401 AAGATGCTAA GTCTGCTGGT GTCAATGTTG TCCAAGATGC CGATGTTAAT
 451 AAAGGAACTG TTAAAACAGC TGAAGAAGCA GTCCAAAAAG AAACTGAAAT
 501 TAAAGAAGAT TACACAAAAC AAGCTGAGGA TATTAAGAAG ACAACAGATC
 551 AATATAAATC GGATGTAGCT GCTCATGAGG CAGAAGTTGC TAAAATCAAA
 601 GCTAAAAATC AGGCAACTAA AGAACAGTAT GGAAAAGATA TGGTAGCTCA
 651 TAAAGCCGAG GTTGAACGCA TTAATGCTGC AAATGCTGCC AGTAAAACAG
 701 CTTATGAAGC TAAATTGGCT CAATATCAAG CAGATTTAGC AGCCGTTCAA
 751 AAAACCAATG CTGCCAATCA AGCATCCTAT CAAAAAGCCC TTGCTGCTTA
 801 TCAGGCTGAA CTGAAACGTG TTCAGGAAGC TAATGCAGCC GCCAAAGCCG
 851 CTTATGATAC TGCTGTAGCA GCAAATAATG CCAAAAATAC AGAAATTGCC
 901 GCTGCCAATG AAGAAATTAG AAAACGCAAT GCAACGGCCA AAGCTGAATA
 951 TGAGACTAAG TTAGCTCAAT ATCAAGCTGA ACTAAAGCGT GTTCAGGAAG
1001 CTAATGCCGC AAACGAAGCA GACTATCAAG CTAAATTGAC CGCCTATCAA
1051 ACAGAGCTTG CTCGCGTTCA GAAAGCCAAT GCAGATGCTA AAGCGGCCTA
1101 TGAAGCAGCT GTAGCAGCAA ATAATGCCAA AAATGCGGCA CTTACAGCTG
1151 AAAATACTGC AATTAAGCAA CGCAATGAGA ATGCTAAGGC GACTTATGAA
1201 GCTGCACTCA AGCAATATGA GGCTGATTTG GCAGCGGTGA AAAAAGCTAA
1251 TGCCGCAAAC GAAGCAGACT ATCAAGCTAA ATTGACCGCC TATCAAACAG
1301 AGCTCGCTCG CGTTCAAAAG GCCAATGCGG ATGCTAAAGC GGCCTATGAA
1351 GCAGCTGTAG CAGCAAATAA TGCCGCAAAT GCAGCGCTCA CAGCTGAAAA
1401 TACTGCAATT AAGAAGCGCA ATGCGGATGC TAAAGCTGAT TACGAAGCAA
1451 AACTTGCTAA GTATCAAGCA GATCTTGCCA AATATCAAAA AGATTTAGCA
1501 GACTATCCAG TTAAGTTAAA GGCATACGAA GATGAACAAG CTTCTATTAA
1551 AGCTGCACTG GCAGAACTTG AAAAACATAA AAATGAAGAC GGAAACTTAA
1601 CAGAACCATC TGCTCAAAAT TTGGTCTATG ATCTTGAGCC AAATGCGAAC
1651 TTATCTTTGA CAACAGATGG GAAGTTCCTT AAGGCTTCTG CTGTGGATGA
1701 TGCTTTTAGC AAAAGCACTT CAAAAGCAAA ATATGACCAA AAAATTCTTC
1751 AATTAGATGA TCTAGATATC ACTAACTTAG AACAATCTAA TGATGTTGCT
1801 TCTTCTATGG AGCTTTATGG CAATTTTGGT GATAAAGCTG GCTGGTCAAC
1851 GACAGTAAGC AATAACTCAC AGGTTAAATG GGGATCGGTA CTTTTAGAGC
1901 GCGGTCAAAG CGCAACAGCT ACATACACTA ACCTGCAGAA TTCTTATTAC
1951 AATGGTAAAA AGATTTCTAA AATTGTCTAC AAGTATACAG TGGACCCTAA
2001 GTCCAAGTTT CAAGGTCAAA AGGTTTGGTT AGGTATTTTT ACCGATCCAA
2051 CTTTAGGTGT TTTTGCTTCC GCTTATACAG GTCAAGTTGA AAAAAACACT
2101 TCTATTTTTA TTAAAAATGA ATTCACTTTC TATGACGAAG ATGGAAAACC
2151 AATTAATTTT GATAATGCCC TTCTATCAGT AGCTTCTCTT AACCGAGAAA
2201 ATAATTCTAT TGAGATGGCC AAAGATTATA CGGGTAAATT TGTCAAAATC
2251 TCTGGATCAT CTATCGGTGA AAAGAATGGC ATGATTTATG CTACAGATAC
2301 TCTCAACTTT AGGCAGGGTC AAGGTGGTGC TCGTTGGACC ATGTATACCA
2351 GAGCTAGCGA ACCGGGATCT GGCTGGGATA GTTCAGATGC GCCTAACTCT
2401 TGGTATGGTG CTGGTGCTAT CCGCATGTCT GGTCCTAATA ACAGTGTGAC
2451 TTTGGGTGCT ATCTCATCAA CACTTGTTGT GCCTGCTGAT CCTACAATGG
2501 CAATTGAAAC CGGCAAAAAA CCAAATATTT GGTATTCATT AAATGGTAAA
2551 ATCCGTGCGG TTAATCTTCC TAAAGTTACT AAGGAAAAAC CCACACCTCC
2601 GGTTAAACCA ACAGCTCCAA CTAAACCAAC TTATGAAACA GAAAAGCCAT
2651 TAAAACCGGC ACCAGTAGCT CCAAATTATG AAAAGGAGCC AACACCACCG
2701 ACAAGAACAC CGGATCAAGC AGAGCCAAAG AAACCCACTC CGCCGACCTA
2751 TGAAACAGAA AAGCCGTTGG AGCCAGCACC TGTTGAGCCA AGCTATGAAG
2801 CAGAGCCAAC ACCGCCGACA AGGACACCGG ATCAGGCAGA GCCAAATAAA
2851 CCCACACCGC CGACCTATGA AACAGAAAAG CCGTTGGAGC CAGCACCTGT
2901 TGAGCCAAGC TATGAAGCAG AGCCAACGCC ACCGACACCA ACACCAGATC
2951 AACCAGAACC AAACAAACCT GTTGAGCCAA CTTATGAGGT TATTCCAACA
3001 CCGCCGACTG ATCCTGTTTA TCAAGATCTT CCAACACCTC CATCTATACC
3051 AACTGTTCAT TTCCATTACT TTAAACTAGC TGTTCAGCCG CAGGTTAACA
3101 AAGAAATTAG AAACAATAAC GATGTTAATA TTGACAGAAC TTTGGTGGCT
3151 AAACAATCTG TTGTTAAGTT CCAGCTGAAG ACAGCAGATC TCCCTGCTGG
3201 ACGTGATGAA ACAACTTCCT TTGTCTTGGT AGATCCCCTG CCATCTGGTT
3251 ATCAATTTAA TCCTGAAGCT ACAAAAGCTG CCAGCCCTGG CTTTGATGTC
3301 GCTTATGATA ATGCAACTAA TACAGTCACC TTCAAGGCAA CTGCAGCAAC
3351 TTTGGCTACG TTTAATGCTG ATTTGACTAA GTCAGTGGCA ACGATTTATC
3401 CAACAGTGGT CGGACAAGTT CTTAATGATG GCGCAACTTA TAAGAATAAT
3451 TTCTCGCTCA CAGTCAATGA TGCTTATGGC ATTAAATCCA ATGTTGTTCG
3501 GGTGACAACT CCTGGTAAAC CAAATGATCC AGATAACCCA AATAATAATT
3551 ACATTAAGCC AACTAAGGTT AATAAAAATG AAAATGGCGT TGTTATTGAT
3601 GGTAAAACAG TTCTTGCCGG TTCAACGAAT TATTATGAGC TAACTTGGGA
3651 TTTGGATCAA TATAAAAACG ACCGCTCTTC AGCAGATACC ATTCAACAAG
3701 GATTTTACTA TGTAGATGAT TATCCAGAAG AAGCGCTTGA ATTGCGTCAG
3751 GATTTAGTGA AGATTACAGA TGCTAATGGC AATGAAGTTA CTGGTGTTAG
3801 TGTGGATAAT TATACTAGTC TTGAAGCAGC CCCTCAAGAA ATTAGAGATG
3851 TTCTTTCTAA GGCAGGAATT AGACCTAAAG GTGCTTTCCA AATTTTCCGT
3901 GCCGATAATC CAAGAGAATT TTATGATACT TATGTCAAAA CTGGAATTGA
3951 TTTGAAGATT GTATCACCAA TGGTTGTTAA AAAACAAATG GGACAAACAG
4001 GCGGGAGTTA TGAAGATCAA GCTTACCAAA TTGACTTTGG TAATGGTTAT
4051 GCATCAAATA TCGTTATCAA TAATGTTCCT AAGATTAACC CTAAGAAAGA
4101 TGTGACCTTA ACACTTGATC CGGCTGATAC AAATAATGTT GATGGTCAGA
4151 CTATTCCACT TAATACAGTC TTTAATTACC GTTTGATTGG TGGCATTATC
4201 CCTGCAAATC ACTCAGAAGA ACTCTTTGAA TACAATTTCT ATGATGATTA
4251 TGATCAAACA GGAGATCACT ATACTGGTCA GTATAAAGTT TTTGCCAAGG
4301 TTGATATCAC TCTTAAAAAC GGTGTTATTA TCAAGTCAGG TACTGAGTTA
4351 ACTCAGTATA CGACAGCGGA AGTTGATACC ACTAAAGGTG CTATCACAAT
4401 TAAGTTCAAG GAAGCCTTTC TGCGTTCTGT TTCAATTGAT TCAGCCTTCC
4451 AAGCTGAAAG TTATATCCAA ATGAAACGTA TTGCGGTTGG TACTTTTGAA
4501 AATACCTATA TTAATACTGT CAATGGGGTA ACTTACAGTT CAAATACAGT
4551 GAAAACAACT ACTCCTGAGG ATCCTGCAGA CCCTACTGAT CCGCAAGATC
4601 CATCATCACC GCGGACTTCA ACTGTAATTA TCTACAAACC TCAATCAACT
4651 GCTTATCAAC CAAGCTCTGT CCAAAAAACG TTACCAAATA CGGGAGTAAC
4701 AAACAATGCT TATATGCCTT TACTTGGTAT TATTGGCTTA CTTACTAGTT
4751 TTAGTTTGCT TGGCTTAAAG GCTAAGAAAG ATTGACAGCA TAGATATTAC
4801 ATTAGAATTA AAAAGTGAGA TGAAGCGATA AATCACAGAT TGAGCTTTTA
4851 TCTCATTTTT TGATT
__________________________________________________________________________
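The correspondence between a peptide sequence and its DNA sequence in the list above can be checked by translating the DNA codon by codon. The sketch below is an illustrative check only: it assumes the standard genetic code and that each listed DNA sequence begins in frame, and the function and variable names are hypothetical. It uses SEQ. ID. No. 14 and SEQ. ID. No. 3 as the worked example.

```python
# Standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acid code."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


SEQ_ID_3 = "LATFNADLTKSVATIYPTVV"
SEQ_ID_14 = "TTGGCTACGTTTAATGCTGATTTGACTAAGTCAGTGGCAACGATTTATCCAACAGTGGTC"

if __name__ == "__main__":
    print(translate(SEQ_ID_14))
    print(translate(SEQ_ID_14) == SEQ_ID_3)
```

The same function can be applied to any of the other peptide and DNA pairs listed, provided the DNA sequence is supplied as a single unbroken string.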

    __________________________________________________________________________     #             SEQUENCE LISTING                                                 - (1) GENERAL INFORMATION:                                                     -    (iii) NUMBER OF SEQUENCES: 27                                             - (2) INFORMATION FOR SEQ ID NO:1:                                             -      (i) SEQUENCE CHARACTERISTICS:                                           #acids    (A) LENGTH: 190 amino                                                          (B) TYPE: amino acid                                                           (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                 -     (ii) MOLECULE TYPE: peptide                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:                                  #Glu Pro Ser Tyr Glu Alalu Pro Ala Pro Val                                     #                 15                                                           #Gln Pro Glu Pro Asn Lyshr Pro Thr Pro Asp                                     #             30                                                               #Thr Pro Pro Thr Asp Proyr Glu Val Ile Pro                                     #         45                                                                   #Ile Pro Thr Val His Phero Thr Pro Pro Ser                                     #     60                                                                       #Val Asn Lys Glu Ile Argla Val Gln Pro Gln                                     # 80                                                                           #Leu Val Ala Lys Gln Sersn Ile Asp Arg Thr                                     #                 95                                                           #Leu Pro Ala Gly Arg Aspeu Lys Thr Ala Asp                                     #            110     
                                                          #Leu Pro Ser Gly Tyr Glnal Leu Val Asp Pro                                     #        125                                                                   #Pro Gly Phe Asp Val Alahr Lys Ala Ala Ser                                     #    140                                                                       #Lys Ala Thr Ala Ala Thrsn Thr Val Thr Phe                                     #160                                                                           #Ser Val Ala Thr Ile Tyrla Asp Leu Thr Lys                                     #                175                                                           #Gly Ala Thr Tyral Gly Gln Val Leu Asn Asp                                     #            190                                                               - (2) INFORMATION FOR SEQ ID NO:2:                                             -      (i) SEQUENCE CHARACTERISTICS:                                           #acids    (A) LENGTH: 40 amino                                                           (B) TYPE: amino acid                                                           (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                 -     (ii) MOLECULE TYPE: peptide                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:                                  #Leu Val Ala Lys Gln Sersn Ile Asp Arg Thr                                     #                 15                                                           #Leu Pro Ala Gly Arg Aspeu Lys Thr Ala Asp                                     #             30                                                               -  Glu Thr Thr Ser Phe Val Leu Val                                             #         40                                                                   - (2) INFORMATION FOR SEQ ID NO:3:            
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

[...]               Ala Asp Leu Thr Lys Ser Val Ala Thr Ile Tyr
                                                        15
Pro Thr Val Val
            20

(2) INFORMATION FOR SEQ ID NO:4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 110 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

[...]               Asn Ile Asp Arg Thr Leu Val Ala Lys Gln Ser
                                                        15
[...]               Leu Lys Thr Ala Asp Leu Pro Ala Gly Arg Asp
                                                    30
[...]               Val Leu Val Asp Pro Leu Pro Ser Gly Tyr Gln
                                                45
[...]               Thr Lys Ala Ala Ser Pro Gly Phe Asp Val Ala
                                            60
[...]               Asn Thr Val Thr Phe Lys Ala Thr Ala Ala Thr
                                                            80
[...]               Ala Asp Leu Thr Lys Ser Val Ala Thr Ile Tyr
                                                        95
[...]       Val Gly Gln Val Leu Asn Asp Gly Ala Thr Tyr
                                                    110

(2) INFORMATION FOR SEQ ID NO:5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 80 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

[...]               Glu Pro Ala Pro Val Glu Pro Ser Tyr Glu Ala
                                                        15
[...]               Thr Pro Thr Pro Asp Gln Pro Glu Pro Asn Lys
                                                    30
[...]               Tyr Glu Val Ile Pro Thr Pro Pro Thr Asp Pro
                                                45
[...]               Pro Thr Pro Pro Ser Ile Pro Thr Val His Phe
                                            60
[...]               Ala Val Gln Pro Gln Val Asn Lys Glu Ile Arg
                                                            80

(2) INFORMATION FOR SEQ ID NO:6:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 130 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

[...]               Glu Pro Ala Pro Val Glu Pro Ser Tyr Glu Ala
                                                        15
[...]               Thr Pro Thr Pro Asp Gln Pro Glu Pro Asn Lys
                                                    30
[...]               Tyr Glu Val Ile Pro Thr Pro Pro Thr Asp Pro
                                                45
[...]               Pro Thr Pro Pro Ser Ile Pro Thr Val His Phe
                                            60
[...]               Ala Val Gln Pro Gln Val Asn Lys Glu Ile Arg
                                                            80
[...]               Asn Ile Asp Arg Thr Leu Val Ala Lys Gln Ser
                                                        95
[...]               Leu Lys Thr Ala Asp Leu Pro Ala Gly Arg Asp
                                                    110
[...]               Val Leu Val Asp Pro Leu Pro Ser Gly Tyr Gln
                                                125
Phe Asn
    130

(2) INFORMATION FOR SEQ ID NO:7:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 52 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

[...]               Pro Asn Ile Trp Tyr Ser Leu Asn Gly Lys Ile
                                                        15
[...]               Pro Lys Val Thr Lys Glu Lys Pro Thr Pro Pro
                                                    30
[...]               Pro Thr Lys Pro Thr Tyr Glu Thr Glu Lys Pro
                                                45
Leu Lys Pro Ala
    50

(2) INFORMATION FOR SEQ ID NO:8:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 70 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

[...]               Pro Pro Ser Ile Pro Thr Val His Phe His Tyr
                                                        15
[...]               Gln Pro Gln Val Asn Lys Glu Ile Arg Asn Asn
                                                    30
[...]               Asp Arg Thr Leu Val Ala Lys Gln Ser Val Val
                                                45
[...]               Thr Ala Asp Leu Pro Ala Gly Arg Asp Glu Thr
                                            60
Thr Ser Phe Val Leu Val
                    70

(2) INFORMATION FOR SEQ ID NO:9:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 21 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

[...]               Ala Asp Leu Pro Ala Gly Arg Asp Glu Thr Thr
                                                        15
Ser Phe Val Leu Val
            20

(2) INFORMATION FOR SEQ ID NO:10:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 312 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

[...]               Pro Asn Ile Trp Tyr Ser Leu Asn Gly Lys Ile
                                                        15
[...]               Pro Lys Val Thr Lys Glu Lys Pro Thr Pro Pro
                                                    30
[...]               Pro Thr Lys Pro Thr Tyr Glu Thr Glu Lys Pro
                                                45
[...]               Val Ala Pro Asn Tyr Glu Lys Glu Pro Thr Pro
                                            60
[...]               Asp Gln Ala Glu Pro Lys Lys Pro Thr Pro Pro
                                                            80
[...]               Lys Pro Leu Glu Pro Ala Pro Val Glu Pro Ser
                                                        95
[...]               Thr Pro Pro Thr Arg Thr Pro Asp Gln Ala Glu
                                                    110
[...]               Pro Pro Thr Tyr Glu Thr Glu Lys Pro Leu Glu
                                                125
[...]               Pro Ser Tyr Glu Ala Glu Pro Thr Pro Pro Thr
                                            140
[...]               Pro Glu Pro Asn Lys Pro Val Glu Pro Thr Tyr
                                                            160
[...]               Pro Pro Thr Asp Pro Val Tyr Gln Asp Leu Pro
                                                        175
[...]               Pro Thr Val His Phe His Tyr Phe Lys Leu Ala
                                                    190
[...]               Asn Lys Glu Ile Arg Asn Asn Asn Asp Val Asn
                                                205
[...]               Val Ala Lys Gln Ser Val Val Lys Phe Gln Leu
                                            220
[...]               Pro Ala Gly Arg Asp Glu Thr Thr Ser Phe Val
                                                            240
[...]               Pro Ser Gly Tyr Gln Phe Asn Pro Glu Ala Thr
                                                        255
[...]               Gly Phe Asp Val Ala Tyr Asp Asn Ala Thr Asn
                                                    270
[...]               Ala Thr Ala Ala Thr Leu Ala Thr Phe Asn Ala
                                                285
[...]               Val Ala Thr Ile Tyr Pro Thr Val Val Gly Gln
                                            300
Val Leu Asn Asp Gly Ala Thr Tyr
                    310

(2) INFORMATION FOR SEQ ID NO:11:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 30 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

[...]               Pro Pro Ser Ile Pro Thr Val His Phe His Tyr
                                                        15
[...]       Ala Val Gln Pro Gln Val Asn Lys Glu Ile Arg
                                                    30

(2) INFORMATION FOR SEQ ID NO:12:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 570 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

[...]         TGGAGCC AGCACCTGTT GAGCCAAGCT ATGAAGCAGA GCCAACGCCA    60
[...]         CAGATCA ACCAGAACCA AACAAACCTG TTGAGCCAAC TTATGAGGTT   120
[...]         CGACTGA TCCTGTTTAT CAAGATCTTC CAACACCTCC ATCTATACCA   180
[...]         ATTACTT TAAACTAGCT GTTCAGCCGC AGGTTAACAA AGAAATTAGA   240
[...]         TTAATAT TGACAGAACT TTGGTGGCTA AACAATCTGT TGTTAAGTTC   300
[...]         CAGATCT CCCTGCTGGA CGTGATGAAA CAACTTCCTT TGTCTTGGTA   360
[...]         CTGGTTA TCAATTTAAT CCTGAAGCTA CAAAAGCTGC CAGCCCTGGC   420
[...]         ATGATAA TGCAACTAAT ACAGTCACCT TCAAGGCAAC TGCAGCAACT   480
[...]         ATGCTGA TTTGACTAAG TCAGTGGCAA CGATTTATCC AACAGTGGTC   540
[...]           GATGG CGCAACTTAT   570

(2) INFORMATION FOR SEQ ID NO:13:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 120 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

[...]         TTAATAT TGACAGAACT TTGGTGGCTA AACAATCTGT TGTTAAGTTC    60
[...]         CAGATCT CCCTGCTGGA CGTGATGAAA CAACTTCCTT TGTCTTGGTA   120

(2) INFORMATION FOR SEQ ID NO:14:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 60 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

[...]         ATGCTGA TTTGACTAAG TCAGTGGCAA CGATTTATCC AACAGTGGTC    60

(2) INFORMATION FOR SEQ ID NO:15:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 330 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

[...]         TTAATAT TGACAGAACT TTGGTGGCTA AACAATCTGT TGTTAAGTTC    60
[...]         CAGATCT CCCTGCTGGA CGTGATGAAA CAACTTCCTT TGTCTTGGTA   120
[...]         CTGGTTA TCAATTTAAT CCTGAAGCTA CAAAAGCTGC CAGCCCTGGC   180
[...]         ATGATAA TGCAACTAAT ACAGTCACCT TCAAGGCAAC TGCAGCAACT   240
[...]         ATGCTGA TTTGACTAAG TCAGTGGCAA CGATTTATCC AACAGTGGTC   300
[...]           GATGG CGCAACTTAT   330

(2) INFORMATION FOR SEQ ID NO:16:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 240 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

[...]         TGGAGCC AGCACCTGTT GAGCCAAGCT ATGAAGCAGA GCCAACGCCA    60
[...]         CAGATCA ACCAGAACCA AACAAACCTG TTGAGCCAAC TTATGAGGTT   120
[...]         CGACTGA TCCTGTTTAT CAAGATCTTC CAACACCTCC ATCTATACCA   180
[...]         ATTACTT TAAACTAGCT GTTCAGCCGC AGGTTAACAA AGAAATTAGA   240

(2) INFORMATION FOR SEQ ID NO:17:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 390 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
[...]         TGGAGCC AGCACCTGTT GAGCCAAGCT ATGAAGCAGA GCCAACGCCA    60
[...]         CAGATCA ACCAGAACCA AACAAACCTG TTGAGCCAAC TTATGAGGTT   120
[...]         CGACTGA TCCTGTTTAT CAAGATCTTC CAACACCTCC ATCTATACCA   180
[...]         ATTACTT TAAACTAGCT GTTCAGCCGC AGGTTAACAA AGAAATTAGA   240
[...]         TTAATAT TGACAGAACT TTGGTGGCTA AACAATCTGT TGTTAAGTTC   300
[...]         CAGATCT CCCTGCTGGA CGTGATGAAA CAACTTCCTT TGTCTTGGTA   360
[...]           GGTTA TCAATTTAAT   390

(2) INFORMATION FOR SEQ ID NO:18:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 156 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

[...]         AACCAAA TATTTGGTAT TCATTAAATG GTAAAATCCG TGCGGTTAAT    60
[...]         CTAAGGA AAAACCCACA CCTCCGGTTA AACCAACAGC TCCAACTAAA   120
[...]           GAAAA GCCATTAAAA CCGGCA   156

(2) INFORMATION FOR SEQ ID NO:19:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 210 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

[...]         CACCTCC ATCTATACCA ACTGTTCATT TCCATTACTT TAAACTAGCT    60
[...]         TTAACAA AGAAATTAGA AACAATAACG ATGTTAATAT TGACAGAACT   120
[...]         AATCTGT TGTTAAGTTC CAGCTGAAGA CAGCAGATCT CCCTGCTGGA   180
[...]           TCCTT TGTCTTGGTA   210

(2) INFORMATION FOR SEQ ID NO:20:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 63 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

[...]         CAGCAGA TCTCCCTGCT GGACGTGATG AAACAACTTC CTTTGTCTTG    60
[...]    63

(2) INFORMATION FOR SEQ ID NO:21:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 936 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

[...]         AACCAAA TATTTGGTAT TCATTAAATG GTAAAATCCG TGCGGTTAAT    60
[...]         CTAAGGA AAAACCCACA CCTCCGGTTA AACCAACAGC TCCAACTAAA   120
[...]         CAGAAAA GCCATTAAAA CCGGCACCAG TAGCTCCAAA TTATGAAAAG   180
[...]         CGACAAG AACACCGGAT CAAGCAGAGC CAAAGAAACC CACTCCGCCG   240
[...]         AAAAGCC GTTGGAGCCA GCACCTGTTG AGCCAAGCTA TGAAGCAGAG   300
[...]         CAAGGAC ACCGGATCAG GCAGAGCCAA ATAAACCCAC ACCGCCGACC   360
[...]         AGCCGTT GGAGCCAGCA CCTGTTGAGC CAAGCTATGA AGCAGAGCCA   420
[...]         CAACACC AGATCAACCA GAACCAAACA AACCTGTTGA GCCAACTTAT   480
[...]         CACCGCC GACTGATCCT GTTTATCAAG ATCTTCCAAC ACCTCCATCT   540
[...]         ATTTCCA TTACTTTAAA CTAGCTGTTC AGCCGCAGGT TAACAAAGAA   600
[...]         ACGATGT TAATATTGAC AGAACTTTGG TGGCTAAACA ATCTGTTGTT   660
[...]         AGACAGC AGATCTCCCT GCTGGACGTG ATGAAACAAC TTCCTTTGTC   720
[...]         TGCCATC TGGTTATCAA TTTAATCCTG AAGCTACAAA AGCTGCCAGC   780
[...]         TCGCTTA TGATAATGCA ACTAATACAG TCACCTTCAA GGCAACTGCA   840
[...]         CGTTTAA TGCTGATTTG ACTAAGTCAG TGGCAACGAT TTATCCAACA   900
[...]           CTTAA TGATGGCGCA ACTTAT   936

(2) INFORMATION FOR SEQ ID NO:22:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 90 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: Genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

[...]         CACCTCC ATCTATACCA ACTGTTCATT TCCATTACTT TAAACTAGCT    60
[...]           AACAA AGAAATTAGA    90

(2) INFORMATION FOR SEQ ID NO:23:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1561 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

[...]               Thr Tyr Gly Phe Arg Lys Ser Lys Ile Ser Lys
                                                        15
[...]               Val Leu Gly Thr Val Ala Ala Val Ser Val Ala
                                                    30
[...]               Ala Asp Glu Thr Thr Thr Thr Ser Asp Val Asp
                                                45
[...]               Thr Gln Thr Gly Asn Pro Ala Thr Asn Leu Pro
                                            60
[...]               Ala Ser Lys Gln Ala Glu Gln Ser Gln Thr Lys
                                                            80
[...]               Val His Thr Ile Glu Val Pro Lys Thr Asp Leu
                                                        95
[...]               Asp Ala Lys Ser Ala Gly Val Asn Val Val Gln
                                                    110
[...]               ?ys Gly Thr Val Lys Thr Ala Glu Glu Ala Val
                                                125
[...]               Ile Lys Glu Asp Tyr Thr Lys Gln Ala Glu Asp
                                            140
[...]               Asp Gln Tyr Lys Ser Asp Val Ala Ala His Glu
                                                            160
[...]               Ile Lys Ala Lys Asn Gln Ala Thr Lys Glu Gln
                                                        175
[...]               Val Ala His Lys Ala Glu Val Glu Arg Ile Asn
                                                    190
[...]               Ser Lys Thr Ala Tyr Glu Ala Lys Leu Ala Gln
                                                205
[...]               Ala Ala Val Gln Lys Thr Asn Ala Ala Asn Gln
                                            220
[...]               Ala Leu Ala Ala Tyr Gln Ala Glu Leu Lys Arg
                                                            240
[...]               Ala Ala Ala Lys Ala Ala Tyr Asp Thr Ala Val
                                                        255
[...]               ?ys Asn Thr Glu Ile Ala Ala Ala Asn Glu Glu
                                                    270
[...]               Ala Thr Ala Lys Ala Glu Tyr Glu Thr Lys Leu
                                                285
[...]               Glu Leu Lys Arg Val Gln Glu Ala Asn Ala Ala
                                            300
[...]               Gln Ala Lys Leu Thr Ala Tyr Gln Thr Glu Leu
                                                            320
[...]               Ala Asn Ala Asp Ala Lys Ala Ala Tyr Glu Ala
                                                        335
[...]               Asn Ala Lys Asn Ala Ala Leu Thr Ala Glu Asn
                                                    350
[...]               Arg Asn Glu Asn Ala Lys Ala Thr Tyr Glu Ala
                                                365
[...]               Glu Ala Asp Leu Ala Ala Val Lys Lys Ala Asn
                                            380
[...]               Asp Tyr Gln Ala Lys Leu Thr Ala Tyr Gln Thr
                                                            400
[...]               Gln Lys Ala Asn Ala Asp Ala Lys Ala Ala Tyr
                                                        415
[...]               Ala Asn Asn Ala Ala Asn Ala Ala Leu Thr Ala
                                                    430
[...]               ?ys Lys Arg Asn Ala Asp Ala Lys Ala Asp Tyr
                                                445
[...]               ?ys Tyr Gln Ala Asp Leu Ala Lys Tyr Gln Lys
                                            460
[...]               Pro Val Lys Leu Lys Ala Tyr Glu Asp Glu Gln
                                                            480
[...]               Ala Leu Ala Glu Leu Glu Lys His Lys Asn Glu
                                                        495
[...]               Glu Pro Ser Ala Gln Asn Leu Val Tyr Asp Leu
                                                    510
[...]               Leu Ser Leu Thr Thr Asp Gly Lys Phe Leu Lys
                                                525
[...]               Asp Ala Phe Ser Lys Ser Thr Ser Lys Ala Lys
                                            540
[...]               Leu Gln Leu Asp Asp Leu Asp Ile Thr Asn Leu
                                                            560
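[...]               Val Ala Ser Ser Met Glu Leu Tyr Gly Asn Phe
                                                        575
[...]               Trp Ser Thr Thr Val Ser Asn Asn Ser Gln Val
                                                    590
[...]               Leu Leu Glu Arg Gly Gln Ser Ala Thr Ala Thr
                                                605
[...]               Asn Ser Tyr Tyr Asn Gly Lys Lys Ile Ser Lys
                                            620
[...]               Thr Val Asp Pro Lys Ser Lys Phe Gln Gly Gln
                                                            640
[...]               Ile Phe Thr Asp Pro Thr Leu Gly Val Phe Ala
                                                        655
[...]               Gln Val Glu Lys Asn Thr Ser Ile Phe Ile Lys
                                                    670
[...]               Tyr Asp Glu Asp Gly Lys Pro Ile Asn Phe Asp
                                                685
[...]               Val Ala Ser Leu Asn Arg Glu Asn Asn Ser Ile
                                            700
[...]               Tyr Thr Gly Lys Phe Val Lys Ile Ser Gly Ser
                                                            720
[...]               Asn Gly Met Ile Tyr Ala Thr Asp Thr Leu Asn
                                                        735
[...]               Gly Gly Ala Arg Trp Thr Met Tyr Thr Arg Ala
                                                    750
[...]               Gly Trp Asp Ser Ser Asp Ala Pro Asn Ser Trp
                                                765
[...]               Ile Arg Met Ser Gly Pro Asn Asn Ser Val Thr
                                            780
[...]               Ser Thr Leu Val Val Pro Ala Asp Pro Thr Met
                                                            800
[...]               ?ys Lys Pro Asn Ile Trp Tyr Ser Leu Asn Gly
                                                        815
[...]               Asn Leu Pro Lys Val Thr Lys Glu Lys Pro Thr
                                                    830
[...]               Thr Ala Pro Thr Lys Pro Thr Tyr Glu Thr Glu
                                                845
[...]               Ala Pro Val Ala Pro Asn Tyr Glu Lys Glu Pro
                                            860
[...]               Thr Pro Asp Gln Ala Glu Pro Lys Lys Pro Thr
                                                            880
[...]               Thr Glu Lys Pro Leu Glu Pro Ala Pro Val Glu
                                                        895
[...]               Glu Pro Thr Pro Pro Thr Arg Thr Pro Asp Gln
                                                    910
[...]               Pro Thr Pro Pro Thr Tyr Glu Thr Glu Lys Pro
                                                925
[...]               Val Glu Pro Ser Tyr Glu Ala Glu Pro Thr Pro
                                            940
[...]               Asp Gln Pro Glu Pro Asn Lys Pro Val Glu Pro
                                                            960
#Asp 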
                    #        765                                                                   #Pro Asn Asn Ser Val Thrle Arg Met Ser Gly                                     #    780                                                                       #Pro Ala Asp Pro Thr Meter Thr Leu Val Val                                     #800                                                                           #Trp Tyr Ser Leu Asn Glyys Lys Pro Asn Ile                                     #                815                                                           #Thr Lys Glu Lys Pro Thrsn Leu Pro Lys Val                                     #            830                                                               #Pro Thr Tyr Glu Thr Gluhr Ala Pro Thr Lys                                     #        845                                                                   #Asn Tyr Glu Lys Glu Prola Pro Val Ala Pro                                     #    860                                                                       #Glu Pro Lys Lys Pro Thrhr Pro Asp Gln Ala                                     #880                                                                           #Glu Pro Ala Pro Val Gluhr Glu Lys Pro Leu                                     #                895                                                           #Thr Arg Thr Pro Asp Glnlu Pro Thr Pro Pro                                     #            910                                                               #Tyr Glu Thr Glu Lys Proro Thr Pro Pro Thr                                     #        925                                                                   #Glu Ala Glu Pro Thr Proal Glu Pro Ser Tyr                                     #    940                                                                       #Asn Lys Pro Val Glu Prosp Gln Pro Glu Pro                                     #960                                                                           #Asp 
Pro Val Tyr Gln Aspro Thr Pro Pro Thr                                     #                975                                                           #His Phe His Tyr Phe Lyser Ile Pro Thr Val                                     #            990                                                               #Ile Arg Asn Asn Asn Aspln Val Asn Lys Glu                                     #       10050                                                                  #Gln Ser Val Val Lys Phehr Leu Val Ala Lys                                     #   10205                                                                      #Arg Asp Glu Thr Thr Sersp Leu Pro Ala Gly                                     #1030                103 - #5                104                               #Tyr Gln Phe Asn Pro Gluro Leu Pro Ser Gly                                     #               10555 - #                1050                                  #Val Ala Tyr Asp Asn Alaer Pro Gly Phe Asp                                     #           10705                                                              #Ala Thr Leu Ala Thr Phehe Lys Ala Thr Ala                                     #       10850                                                                  #Ile Tyr Pro Thr Val Valys Ser Val Ala Thr                                     #   11005                                                                      #Lys Asn Asn Phe Ser Leusp Gly Ala Thr Tyr                                     #1110                111 - #5                112                               #Asn Val Val Arg Val Thryr Gly Ile Lys Ser                                     #               11355 - #                1130                                  #Pro Asn Asn Asn Tyr Ilesn Asp Pro Asp Asn                                     #           11505                                                              #Gly Val Val Ile Asp Glysn Lys Asn Glu Asn                                     #       11650                 
                                                 #Tyr Glu Leu Thr Trp Asply Ser Thr Asn Tyr                                     #   11805                                                                      #Ala Asp Thr Ile Gln Glnsn Asp Arg Ser Ser                                     #1190                119 - #5                120                               #Glu Ala Leu Glu Leu Argsp Asp Tyr Pro Glu                                     #               12155 - #                1210                                  #Gly Asn Glu Val Thr Glyle Thr Asp Ala Asn                                     #           12305                                                              #Ala Ala Pro Gln Glu Ileyr Thr Ser Leu Glu                                     #       12450                                                                  #Pro Lys Gly Ala Phe Glnys Ala Gly Ile Arg                                     #   12605                                                                      #Tyr Asp Thr Tyr Val Lyssn Pro Arg Glu Phe                                     #1270                127 - #5                128                               #Met Val Val Lys Lys Glnys Ile Val Ser Pro                                     #               12955 - #                1290                                  #Gln Ala Tyr Gln Ile Asply Ser Tyr Glu Asp                                     #           13105                                                              #Ile Asn Asn Val Pro Lysla Ser Asn Ile Val                                     #       13250                                                                  #Leu Asp Pro Ala Asp Thrsp Val Thr Leu Thr                                     #   13405                                                                      #Asn Thr Val Phe Asn Tyrln Thr Ile Pro Leu                                     #1350                135 - #5                136                               #His Ser Glu Glu Leu Phele Ile Pro Ala Asn             
                        #               13755 - #                1370                                  #Thr Gly Asp His Tyr Thrsp Asp Tyr Asp Gln                                     #           13905                                                              #Ile Thr Leu Lys Asn Glyhe Ala Lys Val Asp                                     #       14050                                                                  #Gln Tyr Thr Thr Ala Gluly Thr Glu Leu Thr                                     #   14205                                                                      #Lys Phe Lys Glu Ala Phely Ala Ile Thr Ile                                     #1430                143 - #5                144                               #Gln Ala Glu Ser Tyr Ilele Asp Ser Ala Phe                                     #               14555 - #                1450                                  #Glu Asn Thr Tyr Ile Asnla Val Gly Thr Phe                                     #           14705                                                              #Thr Val Lys Thr Thr Thrhr Tyr Ser Ser Asn                                     #       14850                                                                  #Gln Asp Pro Ser Ser Prosp Pro Thr Asp Pro                                     #   15005                                                                      #Gln Ser Thr Ala Tyr Glnle Ile Tyr Lys Pro                                     #1510                151 - #5                152                               #Thr Gly Val Thr Asn Asnys Thr Leu Pro Asn                                     #               15355 - #                1530                                  #Leu Val Thr Ser Phe Sereu Gly Ile Ile Gly                                     #           15505                                                              -  Leu Leu Gly Leu Lys Ala Lys Lys Asp                                         #        1560                                                                  
- (2) INFORMATION FOR SEQ ID NO:24:                                            -      (i) SEQUENCE CHARACTERISTICS:                                           #pairs    (A) LENGTH: 4865 base                                                          (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                 -     (ii) MOLECULE TYPE: Genomic DNA                                          -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:                                 #GTATATTACA    60TTGACAA ATCAAATCAA TTATATTACA ATTTTTTAAC                      #AAAAACTTAC   120AAGATTT ATTCAGATTT GGAGGATTTA TGAAAGTCAA                      #AACAGTAGCA   180GTAAAAT TAGTAAAACA CTGTGTGGTG CTGTTCTAGG                      #TAGTGATGTA   240CAGGACA AAAGGTTTTT GCCGATGAAA CGACCACTAC                      #AGAGGCTCAA   300TTGGAAC ACAAACTGGA AATCCAGCGA CCAATTTGCC                      #AATGGTTCAT   360AGCAAGC TGAACAAAGT CAAACCAAGC TGGAGAGACA                      #GTCTGCTGGT   420CTAAAAC TGATCTTGAT CAAGCAGCAA AAGATGCTAA                      #TGAAGAAGCA   480AAGATGC CGATGTTAAT AAAGGAACTG TTAAAACAGC                      #TATTAAGAAG   540CTGAAAT TAAAGAAGAT TACACAAAAC AAGCTGAGGA                      #TAAAATCAAA   600ATAAATC GGATGTAGCT GCTCATGAGG CAGAAGTTGC                      #TAAAGCCGAG   660CAACTAA AGAACAGTAT GGAAAAGATA TGGTAGCTCA                      #TAAATTGGCT   720ATGCTGC AAATGCTGCC AGTAAAACAG CTTATGAAGC                      #AGCATCCTAT   780ATTTAGC AGCCGTTCAA AAAACCAATG CTGCCAATCA                      #TAATGCAGCC   840CTGCTTA TCAGGCTGAA CTGAAACGTG TTCAGGAAGC                      #AGAAATTGCC   900ATGATAC TGCTGTAGCA GCAAATAATG CCAAAAATAC                      #TGAGACTAAG   960AAATTAG AAAACGCAAT GCAACGGCCA AAGCTGAATA                      #AAACGAAGCA  1020AAGCTGA ACTAAAGCGT GTTCAGGAAG CTAATGCCGC                      #GAAAGCCAAT  1080AATTGAC 
CGCCTATCAA ACAGAGCTTG CTCGCGTTCA                      #AAATGCGGCA  1140CGGCCTA TGAAGCAGCT GTAGCAGCAA ATAATGCCAA                      #GACTTATGAA  1200ATACTGC AATTAAGCAA CGCAATGAGA ATGCTAAGGC                      #TGCCGCAAAC  1260AATATGA GGCTGATTTG GCAGCGGTGA AAAAAGCTAA                      #CGTTCAAAAG  1320AAGCTAA ATTGACCGCC TATCAAACAG AGCTCGCTCG                      #TGCCGCAAAT  1380CTAAAGC GGCCTATGAA GCAGCTGTAG CAGCAAATAA                      #TAAAGCTGAT  1440CTGAAAA TACTGCAATT AAGAAGCGCA ATGCGGATGC                      #AGATTTAGCA  1500TTGCTAA GTATCAAGCA GATCTTGCCA AATATCAAAA                      #AGCTGCACTG  1560AGTTAAA GGCATACGAA GATGAACAAG CTTCTATTAA                      #TGCTCAAAAT  1620AACATAA AAATGAAGAC GGAAACTTAA CAGAACCATC                      #GAAGTTCCTT  1680TTGAGCC AAATGCGAAC TTATCTTTGA CAACAGATGG                      #ATATGACCAA  1740TGGATGA TGCTTTTAGC AAAAGCACTT CAAAAGCAAA                      #TGATGTTGCT  1800TAGATGA TCTAGATATC ACTAACTTAG AACAATCTAA                      #GACAGTAAGC  1860TTTATGG CAATTTTGGT GATAAAGCTG GCTGGTCAAC                      #CGCAACAGCT  1920TTAAATG GGGATCGGTA CTTTTAGAGC GCGGTCAAAG                      #AGGTTTGGTT  1980TGCAGAA TTCTTATTAC GTCCAAGTTT CAAGGTCAAA                      #AAGTATACAG  2040GATCCAA AATGGTAAAA AGATTTCTAA AATTGTCTAC                      #AAAAAACACT  2100TAGGTGT TTTTGCTTCC GCTTATACAG GTCAAGTTGA                      #AATTAATTTT  2160AAAATGA ATTCACTTTC TATGACGAAG ATGGAAAACC                      #TGAGATGGCC  2220TATCAGT AGCTTCTCTT AACCGAGAAA ATAATTCTAT                      #AAAGAATGGC  2280GTAAATT TGTCAAAATC TCTGGATCAT CTATCGGTGA                      #TCGTTGGACC  2340CAGATAC TCTCAACTTT AGGCAGGGTC AAGGTGGTGC                      #GCCTAACTCT  2400CTAGCGA ACCGGGATCT GGCTGGGATA GTTCAGATGC                      #TTTGGGTGCT  2460GTGCTAT CCGCATGTCT GGTCCTAATA ACAGTGTGAC                      #CGGCAAAAAA  2520TTGTTGT GCCTGCTGAT CCTACAATGG CAATTGAAAC                      #TAAAGTTACT  2580ATTCATT AAATGGTAAA ATCCGTGCGG 
TTAATCTTCC                      #TTATGAAACA  2640CACCTCC GGTTAAACCA ACAGCTCCAA CTAAACCAAC                      #AACACCACCG  2700AACCGGC ACCAGTAGCT CCAAATTATG AAAAGGAGCC                      #TGAAACAGAA  2760ATCAAGC AGAGCCAAAG AAACCCACTC CGCCGACCTA                      #ACCGCCGACA  2820CAGCACC TGTTGAGCCA AGCTATGAAG CAGAGCCAAC                      #AACAGAAAAG  2880AGGCAGA GCCAAATAAA CCCACACCGC CGACCTATGA                      #ACCGACACCA  2940CACCTGT TGAGCCAAGC TATGAAGCAG AGCCAACGCC                      #TATTCCAACA  3000CAGAACC AAACAAACCT GTTGAGCCAA CTTATGAGGT                      #AACTGTTCAT  3060CTGTTTA TCAAGATCTT CCAACACCTC CATCTATACC                      #AAACAATAAC  3120AACTAGC TGTTCAGCCG CAGGTTAACA AAGAAATTAG                      #CCAGCTGAAG  3180ACAGAAC TTTGGTGGCT AAACAATCTG TTGTTAAGTT                      #AGATCCCCTG  3240CTGCTGG ACGTGATGAA ACAACTTCCT TTGTCTTGGT                      #CTTTGATGTC  3300AATTTAA TCCTGAAGCT ACAAAAGCTG CCAGCCCTGG                      #TTTGGCTACG  3360CAACTAA TACAGTCACC TTCAAGGCAA CTGCAGCAAC                      #CGGACAAGTT  3420TGACTAA GTCAGTGGCA ACGATTTATC CAACAGTGGT                      #TGCTTATGGC  3480CAACTTA TAAGAATAAT TTCTCGCTCA CAGTCAATGA                      #AGATAACCCA  3540TTGTTCG GGTGACAACT CCTGGTAAAC CAAATGATCC                      #TGTTATTGAT  3600TTAAGCC AACTAAGGTT AATAAAAATG AAAATGGCGT                      #TTTGGATCAA  3660TTGCCGG TTCAACGAAT TATTATGAGC TAACTTGGGA                      #TGTAGATGAT  3720GCTCTTC AGCAGATACC ATTCAACAAG GATTTTACTA                      #TGCTAATGGC  3780CGCTTGA ATTGCGTCAG GATTTAGTGA AGATTACAGA                      #CCCTCAAGAA  3840GTGTTAG TGTGGATAAT TATACTAGTC TTGAAGCAGC                      #AATTTTCCGT  3900TTTCTAA GGCAGGAATT AGACCTAAAG GTGCTTTCCA                      #TTTGAAGATT  3960GAGAATT TTATGATACT TATGTCAAAA CTGGAATTGA                      #TGAAGATCAA  4020TTGTTAA AAAACAAATG GGACAAACAG GCGGGAGTTA                      #TAATGTTCCT  4080ACTTTGG TAATGGTTAT GCATCAAATA TCGTTATCAA               
       #AAATAATGTT  4140AGAAAGA TGTGACCTTA ACACTTGATC CGGCTGATAC                      #TGGCATTATC  4200TTCCACT TAATACAGTC TTTAATTACC GTTTGATTGG                      #TGATCAAACA  4260CAGAAGA ACTCTTTGAA TACAATTTCT ATGATGATTA                      #TCTTAAAAAC  4320CTGGTCA GTATAAAGTT TTTGCCAAGG TTGATATCAC                      #AGTTGATACC  4380AGTCAGG TACTGAGTTA ACTCAGTATA CGACAGCGGA                      #TTCAATTGAT  4440TCACAAT TAAGTTCAAG GAAGCCTTTC TGCGTTCTGT                      #TACTTTTGAA  4500CTGAAAG TTATATCCAA ATGAAACGTA TTGCGGTTGG                      #GAAAACAACT  4560ATACTGT CAATGGGGTA ACTTACAGTT CAAATACAGT                      #GCGGACTTCA  4620CTGCAGA CCCTACTGAT CCGCAAGATC CATCATCACC                      #CCAAAAAACG  4680ACAAACC TCAATCAACT GCTTATCAAC CAAGCTCTGT                      #TATTGGCTTA  4740GAGTAAC AAACAATGCT TATATGCCTT TACTTGGTAT                      #TAGATATTAC  4800GTTTGCT TGGCTTAAAG GCTAAGAAAG ATTGACAGCA                      #TCTCATTTTT  4860AGTGAGA TGAAGCGATA AATCACAGAT TGAGCTTTTA                      #          4865                                                                - (2) INFORMATION FOR SEQ ID NO:25:                                            -      (i) SEQUENCE CHARACTERISTICS:                                           #pairs    (A) LENGTH: 33 base                                                            (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                 -     (ii) MOLECULE TYPE: Genomic DNA                                          -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:                                 #         33       GTTCA TTTCCATTAC TTT                                        - (2) INFORMATION FOR SEQ ID NO:26:                                            -      (i) SEQUENCE CHARACTERISTICS:                                           #pairs    (A) 
LENGTH: 36 base                                                            (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                 -     (ii) MOLECULE TYPE: Genomic DNA                                          -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:                                 #       36         TTCAT TTTTATTAAC CTTAGT                                     - (2) INFORMATION FOR SEQ ID NO:27:                                            -      (i) SEQUENCE CHARACTERISTICS:                                           #acids    (A) LENGTH: 20 amino                                                           (B) TYPE: amino acid                                                           (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                 -     (ii) MOLECULE TYPE: peptide                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:                                 #Ile Ile Asn Glu Glu Alaln Ile Ile Arg Asp                                     #                 15                                                           -  Ala Asp Trp Asp                                                                          20                                                                __________________________________________________________________________ 

We claim:
 1. A polypeptide or an extended polypeptide, wherein said extended polypeptide is extended at the N-terminus or C-terminus or both with non-wild-type amino acid sequence to form said extended polypeptide; wherein said polypeptide is selected from the group consisting of a polypeptide consisting of an amino acid sequence corresponding to residues 925-1114 of the Streptococcus mutans antigen I/II (SA I/II) (SEQ ID NO:1); a polypeptide consisting of an amino acid sequence corresponding to residues 1005-1044 of SA I/II (SEQ ID NO:2); a polypeptide consisting of an amino acid sequence corresponding to residues 1085-1104 of SA I/II (SEQ ID NO:3); a polypeptide consisting of an amino acid sequence corresponding to residues 1005-1114 of SA I/II (SEQ ID NO:4); a polypeptide consisting of an amino acid sequence corresponding to residues 925-1004 of SA I/II (SEQ ID NO:5); a polypeptide consisting of an amino acid sequence corresponding to residues 925-1054 of SA I/II (SEQ ID NO:6); a polypeptide consisting of an amino acid sequence corresponding to residues 803-854 of SA I/II (SEQ ID NO:7); a polypeptide consisting of an amino acid sequence corresponding to residues 975-1044 of SA I/II (SEQ ID NO:8); a polypeptide consisting of an amino acid sequence corresponding to residues 1024-1044 of SA I/II (SEQ ID NO:9); a polypeptide consisting of an amino acid sequence corresponding to residues 1025-1044 of SA I/II (residues 2-21 of SEQ ID NO:9); a polypeptide consisting of an amino acid sequence corresponding to residues 804-1114 of SA I/II (SEQ ID NO:10); a polypeptide consisting of an amino acid sequence corresponding to residues 975-1004 of SA I/II (SEQ ID NO:11); and a polypeptide which differs from any of the aforesaid polypeptides by up to and including 8 amino acid alterations wherein said alterations consist of the substitution and/or deletion and/or insertion of up to and including 8 amino acids and having the same immunological and adhesion properties as said corresponding sequence of any one of the aforesaid polypeptides; and wherein said polypeptide or extended polypeptide may be in the N-terminal acylated and/or C-terminal amidated form.
 2. The polypeptide or extended polypeptide of claim 1 which has the amino acid sequence corresponding to SEQ ID NO:9 or differs from said sequence by up to and including 8 amino acid alterations wherein said alterations consist of substitution and/or insertion and/or deletion of 1, 2, 3, 4, 5 or 8 amino acids.
 3. The polypeptide or extended polypeptide of claim 2 wherein said polypeptide has the amino acid sequence of SEQ ID NO:9 or differs from said amino acid sequence of SEQ ID NO:9 by virtue of substitution, deletion or insertion of one amino acid.
 4. A pharmaceutical composition comprising the polypeptide or extended polypeptide of claim 1 in a pharmaceutically acceptable carrier.
 5. A pharmaceutical composition comprising the polypeptide or extended polypeptide of claim 3 in a pharmaceutically acceptable carrier.
 6. An immunological composition comprising the polypeptide or extended polypeptide of claim 1 along with an immunologically acceptable carrier.
 7. An immunological composition comprising the polypeptide or extended polypeptide of claim 3 along with an immunologically acceptable carrier.
 8. The composition of claim 4 which is formulated for topical application in the mouth.
 9. The composition of claim 5 which is formulated for topical application in the mouth.
 10. A method to vaccinate or treat a mammalian host against dental caries which method comprises administering to the host an effective amount of the polypeptide or extended polypeptide of claim 1.
 11. A method to vaccinate or treat a mammalian host against dental caries which method comprises administering to the host an effective amount of the polypeptide or extended polypeptide of claim 3.
 12. The method of claim 10 wherein the polypeptide or extended polypeptide is administered by topical application in the mouth.
 13. The method of claim 11 wherein the polypeptide or extended polypeptide is administered by topical application in the mouth.
 14. The polypeptide or extended polypeptide of claim 1 wherein said polypeptide consists of the amino acid sequence corresponding to residues 1025-1044 of SA I/II (residues 2-21 of SEQ ID NO:9).
 15. A pharmaceutical composition comprising the polypeptide or extended polypeptide of claim 14 in a pharmaceutically acceptable carrier.
 16. An immunological composition comprising the polypeptide or extended polypeptide of claim 14 along with an immunologically acceptable carrier.
 17. The composition of claim 15 which is formulated for topical application in the mouth.
 18. A method to vaccinate or treat a mammalian host against dental caries which method comprises administering to the host an effective amount of the polypeptide or extended polypeptide of claim 14.
 19. The method of claim 18 wherein the polypeptide or extended polypeptide is administered by topical application in the mouth.
 20. The polypeptide or extended polypeptide of claim 1 wherein said polypeptide is selected from the group consisting of a polypeptide consisting of an amino acid sequence corresponding to residues 925-1114 of the Streptococcus mutans antigen I/II (SA I/II) (SEQ ID NO:1); a polypeptide consisting of an amino acid sequence corresponding to residues 1005-1044 of SA I/II (SEQ ID NO:2); a polypeptide consisting of an amino acid sequence corresponding to residues 1085-1104 of SA I/II (SEQ ID NO:3); a polypeptide consisting of an amino acid sequence corresponding to residues 1005-1114 of SA I/II (SEQ ID NO:4); a polypeptide consisting of an amino acid sequence corresponding to residues 925-1004 of SA I/II (SEQ ID NO:5); a polypeptide consisting of an amino acid sequence corresponding to residues 925-1054 of SA I/II (SEQ ID NO:6); a polypeptide consisting of an amino acid sequence corresponding to residues 803-854 of SA I/II (SEQ ID NO:7); a polypeptide consisting of an amino acid sequence corresponding to residues 975-1044 of SA I/II (SEQ ID NO:8); a polypeptide consisting of an amino acid sequence corresponding to residues 1024-1044 of SA I/II (SEQ ID NO:9); a polypeptide consisting of an amino acid sequence corresponding to residues 1025-1044 of SA I/II (residues 2-21 of SEQ ID NO:9); a polypeptide consisting of an amino acid sequence corresponding to residues 804-1114 of SA I/II (SEQ ID NO:10); and a polypeptide consisting of an amino acid sequence corresponding to residues 975-1004 of SA I/II (SEQ ID NO:11).
 21. The polypeptide or extended polypeptide of claim 20 wherein said polypeptide is selected from the group consisting of a polypeptide consisting of an amino acid sequence corresponding to residues 1005-1044 of the Streptococcus mutans antigen I/II (SA I/II) (SEQ ID NO:2); a polypeptide consisting of an amino acid sequence corresponding to residues 1085-1104 of SA I/II (SEQ ID NO:3); a polypeptide consisting of an amino acid sequence corresponding to residues 1005-1114 of SA I/II (SEQ ID NO:4); a polypeptide consisting of an amino acid sequence corresponding to residues 925-1004 of SA I/II (SEQ ID NO:5); a polypeptide consisting of an amino acid sequence corresponding to residues 925-1054 of SA I/II (SEQ ID NO:6); a polypeptide consisting of an amino acid sequence corresponding to residues 803-854 of SA I/II (SEQ ID NO:7); a polypeptide consisting of an amino acid sequence corresponding to residues 975-1044 of SA I/II (SEQ ID NO:8); a polypeptide consisting of an amino acid sequence corresponding to residues 1024-1044 of SA I/II (SEQ ID NO:9); a polypeptide consisting of an amino acid sequence corresponding to residues 1025-1044 of SA I/II (residues 2-21 of SEQ ID NO:9); and a polypeptide consisting of an amino acid sequence corresponding to residues 975-1004 of SA I/II (SEQ ID NO:11).
 22. A pharmaceutical composition comprising the polypeptide or extended polypeptide of claim 20 in a pharmaceutically acceptable carrier.
 23. A pharmaceutical composition comprising the polypeptide or extended polypeptide of claim 21 in a pharmaceutically acceptable carrier.
 24. A method to vaccinate or treat a mammalian host against dental caries which method comprises administering to the host an effective amount of the polypeptide or extended polypeptide of claim 20.
 25. A method to vaccinate or treat a mammalian host against dental caries which method comprises administering to the host an effective amount of the polypeptide or extended polypeptide of claim 21.
 26. The method of claim 24 wherein the polypeptide or extended polypeptide is administered by topical application in the mouth.
 27. The method of claim 25 wherein the polypeptide or extended polypeptide is administered by topical application in the mouth.
 28. An immunological composition comprising the polypeptide or extended polypeptide of claim 20 along with an immunologically acceptable carrier.
 29. An immunological composition comprising the polypeptide or extended polypeptide of claim 21 along with an immunologically acceptable carrier.
 30. The composition of claim 22 which is formulated for topical application in the mouth.
 31. The composition of claim 23 which is formulated for topical application in the mouth. 
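
The claims define variants as sequences that differ from a listed fragment "by up to and including 8 amino acid alterations", where an alteration is a substitution, deletion or insertion. One way to read that count is as the edit (Levenshtein) distance between two residue strings. The Python sketch below is illustrative only and is not part of the claims: the function names `alteration_count`, `within_alteration_limit` and `fragment_length` are hypothetical, and the one-letter strings in the usage note are made-up toy sequences, not residues of SA I/II. The sketch says nothing about the further requirement that a variant have the same immunological and adhesion properties, which can only be established experimentally.

```python
def alteration_count(reference: str, candidate: str) -> int:
    """Minimum number of single-residue substitutions, deletions or
    insertions needed to turn `reference` into `candidate`
    (Levenshtein distance over one-letter amino acid codes)."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j residues of the candidate.
    prev = list(range(len(candidate) + 1))
    for i, ref_res in enumerate(reference, start=1):
        curr = [i]  # deleting the first i reference residues
        for j, cand_res in enumerate(candidate, start=1):
            curr.append(min(
                prev[j] + 1,                          # deletion
                curr[j - 1] + 1,                      # insertion
                prev[j - 1] + (ref_res != cand_res),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def within_alteration_limit(reference: str, candidate: str, limit: int = 8) -> bool:
    """True if the candidate differs from the reference by at most
    `limit` alterations (8 is the figure recited in claim 1)."""
    return alteration_count(reference, candidate) <= limit


def fragment_length(first_residue: int, last_residue: int) -> int:
    """Number of residues in an inclusive residue range such as 1025-1044."""
    return last_residue - first_residue + 1
```

For example, `alteration_count("ACDEFGHIK", "ACDEYGHIK")` counts one substitution in a toy sequence, and `fragment_length(1024, 1044)` gives 21, which agrees with the claims' description of residues 1025-1044 as residues 2-21 of SEQ ID NO:9.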